Enforcing consistency between exponential-family emission and posterior distributions.
We saw above that when the emission and posterior distributions are both in exponential families, the natural parameters are constrained by Eq. 3.1.
To simplify the presentation, we repeat the constraint here (with the vector-valued functions named alphabetically):
(3.2)
It is intuitive that this equation constrains the natural parameters (here, and ): no - interaction terms appear on the left-hand side, so those generated on the right must cancel.
This is particularly restrictive since the interactions are created only through inner products.
For example, if contains only terms quadratic in the elements of , then must contain such terms as well, in order to cancel them (except in the trivial case where is constant).
Let all the functions be polynomials in and of maximum degree , and define the monomial bases
(Notice that we have omitted the constants from these bases.)
For appropriately shaped matrices (), vectors (), and constant (), Eq. 3.2 is equivalent to the equation
holding for all values of and .
Therefore,
(3.3)
We shall only make use of the last of these, Eq. 3.3.
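The expansion behind this step can be sketched under assumed notation. All symbols below are hypothetical labels chosen for this illustration and need not match those of Eq. 3.2: x and y are the two sets of variables, a(x) and c(y) play the role of the natural parameters, b(y) and d(x) are the functions they are paired with in the two inner products, m_x and m_y are the monomial bases (constants omitted), and the interaction-generating side of the constraint is taken, up to sign convention, to be a(x)^T b(y) - c(y)^T d(x).

```latex
% Assumed (hypothetical) polynomial parameterization:
%   a(x) = A m_x + \alpha,   d(x) = D m_x + \delta,
%   b(y) = B m_y + \beta,    c(y) = C m_y + \gamma.
\begin{align*}
a(x)^\top b(y) - c(y)^\top d(x)
  &= m_x^\top \left(A^\top B - D^\top C\right) m_y
     && \text{(interaction terms)} \\
  &\quad + m_x^\top \left(A^\top \beta - D^\top \gamma\right)
     && \text{(terms in $x$ only)} \\
  &\quad + \left(\alpha^\top B - \delta^\top C\right) m_y
     && \text{(terms in $y$ only)} \\
  &\quad + \alpha^\top \beta - \gamma^\top \delta
     && \text{(constant)}.
\end{align*}
% The other side of the constraint has no x-y interaction terms, and the
% entries of m_x m_y^\top are distinct monomials, hence linearly
% independent as functions of (x, y).  The interaction coefficients must
% therefore vanish:
\[
A^\top B = D^\top C .
\]
```

Under these assumptions, the matrix identity on the last line is presumably the condition carried forward in the text; the remaining groups of terms only fix the parts of the constraint that depend on x alone, on y alone, or on neither.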
Now assume and are “fat”, that is, : the monomial bases and have at least as many elements as the vector-valued functions and (resp.). Assume further that their rows are linearly independent (full row rank), which is what the existence of a right inverse requires.
Then there exists a (tall) right pseudo-inverse for , call it , such that ; and a (tall) right pseudo-inverse for , call it , such that .
It follows immediately from the last of Eq. 3.3 that
(3.4)
where on the second line we have defined a new matrix .
This allows us to rewrite the functions and in terms of and (resp.):
(3.5)
In a word,
is an affine function of , and is an affine function of ; and the linear transformations are transposes of each other.
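The pseudo-inverse argument can be checked numerically. The following is a minimal sketch under assumed notation, none of which is taken from the text: a(x) = A m_x + alpha and c(y) = C m_y + gamma stand for the natural parameters, b(y) = B m_y + beta and d(x) = D m_x + delta for the functions they are paired with, and the no-interaction condition is taken to be A^T B = D^T C. The sizes, the scalar variables, and the helper `monomials` are likewise hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: k_b, k_d are the lengths of b(y) and d(x);
# n_x, n_y are the lengths of the monomial bases ("fat" means k <= n).
k_b, k_d, n_x, n_y = 2, 3, 5, 4


def monomials(t, n):
    """Monomial basis (t, t^2, ..., t^n) of a scalar t, constant omitted."""
    return t ** np.arange(1, n + 1)


# Fat matrices; Gaussian entries give full row rank with probability one.
B = rng.standard_normal((k_b, n_y))
D = rng.standard_normal((k_d, n_x))

# Build A and C satisfying A^T B = D^T C by way of a hidden matrix W_true.
W_true = rng.standard_normal((k_b, k_d))
A = W_true @ D
C = W_true.T @ B
constraint_gap = np.abs(A.T @ B - D.T @ C).max()

# Right pseudo-inverses: B @ B_pinv = I and D @ D_pinv = I.
B_pinv = np.linalg.pinv(B)
D_pinv = np.linalg.pinv(D)

# Recover W using only A, B, C, D:
#   A^T = D^T (C B^+)  gives  W^T = C B^+,  and  A = W D  gives  W = A D^+.
W = (C @ B_pinv).T
W_alt = A @ D_pinv

# Arbitrary constant offsets of the four vector-valued functions.
alpha, beta = rng.standard_normal(k_b), rng.standard_normal(k_b)
gamma, delta = rng.standard_normal(k_d), rng.standard_normal(k_d)

# Evaluate the functions at one arbitrary point (x, y).
x, y = 0.7, -1.3
m_x, m_y = monomials(x, n_x), monomials(y, n_y)
a, d = A @ m_x + alpha, D @ m_x + delta
b, c = B @ m_y + beta, C @ m_y + gamma

# Affine relations with mutually transposed linear parts.
a_affine = W @ d + (alpha - W @ delta)
c_affine = W.T @ b + (gamma - W.T @ beta)

print("constraint gap:", constraint_gap)
print("a matches W d + const:", np.allclose(a, a_affine))
print("c matches W^T b + const:", np.allclose(c, c_affine))
```

The check constructs A and C from a hidden W only to guarantee that the constraint holds; the recovery step itself uses nothing but the four coefficient matrices and their right pseudo-inverses, which mirrors the order of the argument above.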