ANOVA Decomposition¶
The *analysis of variance (ANOVA) decomposition* is well-defined for any square-integrable multidimensional function \(f: \mathbb{R}^N \to \mathbb{R}\). If the input variables \(\{x_0, \dots, x_{N-1}\}\) are independently distributed random variables, the ANOVA decomposition partitions the total variance of the model, \(\mathrm{Var}[f]\), into a sum of variances \(\mathrm{Var}[f_{\alpha}]\) of mutually orthogonal functions \(f_{\alpha}\), one for each possible subset \(\alpha\) of the input variables. Each \(f_{\alpha}\) effectively depends only on the variables contained in \(\alpha\), and is constant with respect to the rest.
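Spelled out, the decomposition reads

\[
f(x_0, \dots, x_{N-1}) = \sum_{\alpha \subseteq \{x_0, \dots, x_{N-1}\}} f_{\alpha}, \qquad
\mathrm{Var}[f] = \sum_{\alpha \neq \emptyset} \mathrm{Var}[f_{\alpha}],
\]

where the empty-set term \(f_{\emptyset}\) is the constant \(\mathbb{E}[f]\) and carries no variance.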
[1]:
import torch
import tntorch as tn
N = 4
t = tn.rand([32]*N, ranks_tt=5)
Let’s compute all ANOVA terms in one single tensor network:
[2]:
anova = tn.anova_decomposition(t)
anova
[2]:
4D TT-Tucker tensor:
33 33 33 33
| | | |
32 32 32 32
(0) (1) (2) (3)
/ \ / \ / \ / \
1 5 5 5 1
This tensor anova indexes all \(2^N\) functions \(f_{\alpha}\) of the ANOVA decomposition of \(f\), and we can access them using our tensor masks.
Reference: R. Ballester-Ripoll, E. G. Paredes, R. Pajarola: *"Sobol Tensor Trains for Global Sensitivity Analysis"* (2017).
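For instance, here is a minimal sketch (reusing the masking utilities demonstrated throughout this notebook) that extracts the single subfunction \(f_x\):

x, y, z, w = tn.symbols(N)  # one symbolic mask variable per input dimension
f_x = tn.undo_anova_decomposition(tn.mask(anova, tn.only(x)))  # keep only the term that depends on x alone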
Manipulating the Decomposition¶
For example, let’s keep all terms that do not interact with \(w\):
[3]:
x, y, z, w = tn.symbols(N)
anova_cut = tn.mask(anova, ~w)
We can undo the decomposition to obtain a regular tensor again:
[4]:
t_cut = tn.undo_anova_decomposition(anova_cut)
As expected, our truncated tensor t_cut has become constant with respect to the fourth variable \(w\):
[5]:
t_cut[0, 0, 0, :].torch()
[5]:
tensor([6.1559, 6.1559, 6.1559, 6.1559, 6.1559, 6.1559, 6.1559, 6.1559, 6.1559,
6.1559, 6.1559, 6.1559, 6.1559, 6.1559, 6.1559, 6.1559, 6.1559, 6.1559,
6.1559, 6.1559, 6.1559, 6.1559, 6.1559, 6.1559, 6.1559, 6.1559, 6.1559,
6.1559, 6.1559, 6.1559, 6.1559, 6.1559])
How much did we lose by making that variable unimportant?
[6]:
print('The truncated tensor accounts for {:g}% of the original variance.'.format(tn.var(t_cut) / tn.var(t) * 100))
The truncated tensor accounts for 53.3676% of the original variance.
… which is also what Sobol’s method gives us:
[7]:
tn.sobol(t, ~w) * 100
[7]:
tensor(53.3676)
or, equivalently,
[8]:
tn.sobol(t, tn.only(x | y | z)) * 100
[8]:
tensor(53.3676)
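Complementarily, passing the plain symbol w as the mask selects all terms in which \(w\) does participate (a hedged sketch; it assumes the same mask semantics as the calls above), so the following is expected to return the remaining fraction of the variance:

tn.sobol(t, w) * 100  # terms involving w; together with the value above this adds up to 100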
Note that the empty subfunction \(f_{\emptyset}\) is constant, and its value everywhere coincides with the original function’s mean:
[9]:
empty = tn.undo_anova_decomposition(tn.mask(anova, tn.none(N)))
print(tn.var(empty)) # It's a constant function
print(empty[0, 0, 0, 0], tn.mean(t)) # Coincides with the global mean
tensor(6.5831e-14)
tensor(8.3357) tensor(8.3357)
Also note that summing all ANOVA subfunctions results in the original function:
[10]:
all_summed = tn.undo_anova_decomposition(tn.mask(anova, tn.true(N)))
tn.relative_error(t, all_summed)
[10]:
tensor(1.8894e-08)
Truncating Dimensionality¶
The low-dimensional terms of the ANOVA decomposition (i.e. terms that depend only on a few variables) usually play the most important role in many analytical and real-world models.
We will now approximate our original function as a sum of (at most) bivariate functions. To that end we will use a weight mask tensor.
[11]:
m = tn.weight_mask(N, [0, 1, 2]) # Keep tuples with zero, one or two '1's
print('We will keep only {:g} of the original ANOVA terms'.format(tn.sum(m)))
t_cut = tn.undo_anova_decomposition(tn.mask(tn.anova_decomposition(t), m))
print('Relative error after truncation: {}'.format(tn.relative_error(t, t_cut)))
We will keep only 11 of the original ANOVA terms
Relative error after truncation: 0.028525682742228935
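As a sanity check (a small sketch reusing the weight mask idea from the cell above), the truncated tensor should carry essentially no variance in ANOVA terms of order three or higher:

print(tn.sobol(t_cut, tn.weight_mask(N, [3, 4])))  # variance fraction of 3- and 4-variable terms; expected to be numerically zero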
Visualizing the ANOVA Decomposition¶
Let’s now restrict ourselves to \(N = 2\) so that we can easily plot the ANOVA subfunctions and get an idea of what they look like.
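For \(N = 2\) the decomposition simply reads

\[
f(x, y) = f_{\emptyset} + f_x(x) + f_y(y) + f_{xy}(x, y),
\]

with \(f_{\emptyset}\) being the constant mean term.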
[12]:
# We set up a smooth 2D tensor, sum of a few cosine wavefunctions
N = 2
t = tn.randn([64]*N, ranks_tt=4, ranks_tucker=3)
t.set_factors('dct')
t
[12]:
2D TT-Tucker tensor:
64 64
| |
3 3
(0) (1)
/ \ / \
1 4 1
The full function looks like this:
[13]:
import matplotlib.pyplot as plt
%matplotlib inline
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
X, Y = np.meshgrid(np.arange(t.shape[0]), np.arange(t.shape[1]), indexing='ij')
surf = ax.plot_surface(X, Y, t.numpy())
plt.title('Original function $f$')
plt.xlabel('x')
plt.ylabel('y')
plt.show()
Next we will show the \(2^N = 4\) ANOVA subfunctions:
[14]:
x, y = tn.symbols(N)
anova = tn.anova_decomposition(t)
zlim = [t.numpy().min(), t.numpy().max()]
fig = plt.figure(figsize=(8, 8))
ax = fig.add_subplot(2, 2, 1, projection='3d')
ax.plot_surface(X, Y, tn.undo_anova_decomposition(tn.mask(anova, tn.none(N))).numpy()) # Equivalent to ~x & ~y
plt.xlabel('x')
plt.ylabel('y')
ax.set_zlim3d(zlim[0], zlim[1])
ax.set_title(r'$f_{\emptyset}$')
ax = fig.add_subplot(2, 2, 2, projection='3d')
ax.plot_surface(X, Y, tn.undo_anova_decomposition(tn.mask(anova, tn.only(x))).numpy())
plt.xlabel('x')
plt.ylabel('y')
ax.set_zlim3d(zlim[0], zlim[1])
ax.set_title('$f_x$')
ax = fig.add_subplot(2, 2, 3, projection='3d')
ax.plot_surface(X, Y, tn.undo_anova_decomposition(tn.mask(anova, tn.only(y))).numpy())
plt.xlabel('x')
plt.ylabel('y')
ax.set_zlim3d(zlim[0], zlim[1])
ax.set_title('$f_y$')
ax = fig.add_subplot(2, 2, 4, projection='3d')
ax.plot_surface(X, Y, tn.undo_anova_decomposition(tn.mask(anova, tn.only(x & y))).numpy())
plt.xlabel('x')
plt.ylabel('y')
ax.set_zlim3d(zlim[0], zlim[1])
ax.set_title('$f_{xy}$')
plt.show()