9.2 Nonlinear independent-component estimation
We now expand our view to nonlinear, albeit still invertible, transformations [6, 40, 7, 24].
In particular, consider a “generative function”
This change of variables is called a flow [40]. Let us still assume a factorial prior, Eq. 9.2, and furthermore that it does not depend on any parameters. Since the transformations are (by assumption) invertible, the change-of-variables formula still applies. Therefore, Eq. 9.3 still holds, but the Jacobian determinant of composed functions becomes the product of the individual Jacobian determinants:
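In standard notation for flows (the symbols below are illustrative and may differ from the book's), composing $K$ invertible maps $\hat{x} = f_K \circ \cdots \circ f_1(z)$ gives, by the chain rule,

\[
\det\!\left(\frac{\partial \hat{x}}{\partial z}\right)
= \prod_{k=1}^{K} \det J_k,
\qquad
J_k := \frac{\partial f_k}{\partial u}\bigg|_{u = (f_{k-1}\circ\cdots\circ f_1)(z)},
\]

so the log-determinant of the whole flow is simply the sum of the per-step log-determinants.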
with Jacobians given by
(For the sake of writing derivatives, we have named the argument of the
Since the generative function
Perhaps the most obvious restriction is to require that the transformations be “volume preserving”; that is, to require that the Jacobian determinants always be unity [6].
This can be achieved, for example, by splitting a data vector into two parts, and requiring (1) that the flow at a particular step
… [[multiple layers of this]]
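A minimal sketch of one such coupling step, in the additive form used by NICE [6] (the function names and the toy “network” are my own, for illustration): the first half of the vector passes through unchanged, the second half is shifted by an arbitrary function of the first, and the Jacobian is unit-triangular, so its determinant is exactly one.

```python
import numpy as np

def coupling_forward(z, shift_net):
    # Split the vector into two halves; transform the second conditioned on the first.
    d = z.shape[-1] // 2
    z1, z2 = z[..., :d], z[..., d:]
    x1 = z1                          # identity on the first half
    x2 = z2 + shift_net(z1)         # additive update: Jacobian is unit triangular
    return np.concatenate([x1, x2], axis=-1)

def coupling_inverse(x, shift_net):
    # Inversion is exact: subtract the same shift, which depends only on x1 = z1.
    d = x.shape[-1] // 2
    x1, x2 = x[..., :d], x[..., d:]
    return np.concatenate([x1, x2 - shift_net(x1)], axis=-1)

# A stand-in for the learned network: any function of the first half will do.
shift = lambda h: 2.0 * np.tanh(h)

rng = np.random.default_rng(0)
z = rng.normal(size=4)
x = coupling_forward(z, shift)
print(np.allclose(coupling_inverse(x, shift), z))   # exactly invertible

# Numerical Jacobian of the forward map: its determinant is 1 (volume preserving).
eps = 1e-6
J = np.stack([(coupling_forward(z + eps * e, shift) - x) / eps
              for e in np.eye(4)], axis=1)
print(np.linalg.det(J))   # ≈ 1
```

Stacking several such layers, alternating which half is held fixed, yields an expressive yet volume-preserving flow.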
Now our loss is, as usual, the relative entropy.
With the “recognition functions”
(For concision, the Jacobians are written as a function directly of
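To make the loss computation concrete (the one-dimensional map below and its names are my own illustrative choices, not the book's): for a flow with generative map $f(z) = \sinh(z)$ and a standard-normal prior, the recognition function is the inverse, $g(x) = \operatorname{arcsinh}(x)$, and the model's marginal log-density follows directly from the change-of-variables formula.

```python
import numpy as np

# Standard-normal prior on the latent variable.
def log_prior(z):
    return -0.5 * z**2 - 0.5 * np.log(2 * np.pi)

def log_marginal(x):
    z = np.arcsinh(x)                 # recognition function: invert the flow
    log_det = -0.5 * np.log1p(x**2)   # log|dg/dx| = -(1/2) log(1 + x^2)
    return log_prior(z) + log_det     # change-of-variables formula

# Sanity check: the resulting density integrates to one.
xs = np.linspace(-50.0, 50.0, 200001)
dx = xs[1] - xs[0]
total = np.sum(np.exp(log_marginal(xs))) * dx
print(total)   # ≈ 1.0
```

Averaging `-log_marginal` over samples of data gives the cross entropy, which is the relative-entropy loss up to the (fixed) entropy of the data distribution.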
The discriminative dual.
The model defined by Eq. 9.7, along with the loss in Eq. 9.8, has been called “nonlinear independent-component estimation” (NICE) [6].
To see if the name is apposite, we employ our discriminative/generative duality, reinterpreting the minimization of the relative entropy in Eq. 9.8 as a maximization of mutual information between the data,
In this case, the generative marginal given by the normalizing flow, Eq. 9.7, becomes
Despite the appearance of a normal distribution in this expression, this marginal distribution is certainly not normal—even though the generative prior,
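One way to see this concretely (the $\sinh$ map below is an illustrative choice of mine, not the book's): push a standard normal through $x = \sinh(z)$ and check that the resulting log-density is not quadratic in $x$, as it would have to be for any normal distribution.

```python
import numpy as np

# Marginal density of x = sinh(z), z ~ N(0, 1), via the change of variables:
# log p(x) = log N(arcsinh(x); 0, 1) + log|d arcsinh(x)/dx|.
# A normal factor appears, but p itself is not normal.
def log_p(x):
    z = np.arcsinh(x)
    return -0.5 * z**2 - 0.5 * np.log(2 * np.pi) - 0.5 * np.log1p(x**2)

# For a Gaussian, the second difference of log p on a uniform grid is constant.
xs = np.linspace(-3.0, 3.0, 7)
d2 = np.diff(log_p(xs), n=2)
print(d2)   # varies with x, so the marginal is not normal
```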
[[Connection to HMC]]