A.1 Leave-one-out cross validation for linear regression in one step
As we have seen (Section LABEL:sec:), in vanilla linear regression, given a (fat) data matrix X, whose N columns are the input samples, and a matrix Y whose corresponding columns are the outputs, the squared-error-minimizing weights are

    W = Y X^T (X X^T)^{-1}.    (A.1)

Our predicted outputs are then:

    \hat{Y} = W X = Y X^T (X X^T)^{-1} X = Y H,    (A.2)

where we have defined the so-called hat matrix:

    H := X^T (X X^T)^{-1} X.    (A.3)

If we define the "residual matrix"

    R := I_N - H,    (A.4)

then the residuals are likewise expressed succinctly:

    E := Y - \hat{Y} = Y (I_N - H) = Y R.
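As a concrete sketch of these definitions (with hypothetical dimensions, and following the text's convention of one sample per column of the fat data matrix), the quantities above can be computed with NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: m input dimensions, p output dimensions, N samples.
m, p, N = 3, 2, 20
X = rng.standard_normal((m, N))   # fat data matrix: one sample per column
Y = rng.standard_normal((p, N))   # outputs, one column per sample

# Regression weights minimizing ||Y - W X||_F^2 (assuming X X^T is invertible).
W = Y @ X.T @ np.linalg.inv(X @ X.T)

H = X.T @ np.linalg.inv(X @ X.T) @ X   # hat matrix (N x N)
R = np.eye(N) - H                      # residual matrix

Y_hat = Y @ H   # predictions: Y_hat = W X = Y H
E = Y @ R       # residuals:   E = Y - Y_hat = Y R
```

Note that the hat matrix depends only on the inputs X, not on the outputs Y.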
The leave-one-out residuals.
Now, in each step of the leave-one-out procedure, one column of X (the held-out input, x_n) and the corresponding column of Y (the held-out output, y_n) are removed from the data before solving

    W_{-n} = (Y X^T - y_n x_n^T)(X X^T - x_n x_n^T)^{-1}

for the regression coefficients at the n-th step. Let us write the prediction at the n-th (held-out) sample under these coefficients as \hat{y}_{-n} := W_{-n} x_n. Applying the Sherman-Morrison formula, and recalling that H_nn = x_n^T (X X^T)^{-1} x_n,

    (X X^T - x_n x_n^T)^{-1} x_n = (X X^T)^{-1} x_n / (1 - H_nn),

and therefore

    \hat{y}_{-n} = (Y X^T - y_n x_n^T)(X X^T)^{-1} x_n / (1 - H_nn)
                 = (\hat{y}_n - H_nn y_n) / (1 - H_nn)
                 = (\hat{y}_n - H_nn y_n) / R_nn,    (A.5)

where on the final line we use the fact, from Eq. A.4, that the diagonal entries of the residual matrix are R_nn = 1 - H_nn.
We can clean up this equation by giving names to the diagonal matrices constructed from the diagonals of H and R,

    H_D := diag(H_11, ..., H_NN),    R_D := diag(R_11, ..., R_NN),

in which case

    \hat{Y}_loo = (\hat{Y} - Y H_D) R_D^{-1}.    (A.6)
From the leave-one-out predictions it is simple to calculate the leave-one-out residuals:

    E_loo := Y - \hat{Y}_loo = (Y R_D - \hat{Y} + Y H_D) R_D^{-1} = (Y - \hat{Y}) R_D^{-1} = E R_D^{-1},    (A.7)

where the penultimate equality follows because H_D + R_D = I_N.
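Eq. A.7 can be checked numerically. The sketch below (hypothetical sizes, NumPy assumed) compares the one-step leave-one-out residuals E R_D^{-1} against the residuals obtained by explicitly refitting the regression with each sample removed:

```python
import numpy as np

rng = np.random.default_rng(1)
m, p, N = 3, 2, 20
X = rng.standard_normal((m, N))   # fat data matrix, one sample per column
Y = rng.standard_normal((p, N))

A_inv = np.linalg.inv(X @ X.T)
H = X.T @ A_inv @ X               # hat matrix
R = np.eye(N) - H                 # residual matrix
E = Y @ R                         # standard residuals

# One-step LOO residuals (Eq. A.7): divide column n of E by R_nn.
E_loo = E / np.diag(R)            # broadcasting over columns

# Brute-force check: refit with each sample actually removed.
E_loo_explicit = np.empty_like(E)
for n in range(N):
    keep = np.delete(np.arange(N), n)
    Xn, Yn = X[:, keep], Y[:, keep]
    Wn = Yn @ Xn.T @ np.linalg.inv(Xn @ Xn.T)
    E_loo_explicit[:, n] = Y[:, n] - Wn @ X[:, n]
```

The two computations agree to machine precision, but the one-step version requires only a single fit.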
Interpretation of the leave-one-out residuals.
Eqs. A.6 and A.7 are pleasingly elegant, and yield intuitive interpretations.
To begin with, note that in a standard linear regression, the element H_mn of the hat matrix measures how much the prediction \hat{y}_n draws on the observed output y_m; in particular, the diagonal element H_nn measures how much the prediction at sample n draws on that sample's own output. Eq. A.6 tells us how to transform the standard predictions (\hat{Y}) into leave-one-out predictions: subtract from each prediction the contribution of its own output, H_nn y_n, and then rescale by 1/(1 - H_nn) so that the remaining contributions are properly renormalized. Likewise, Eq. A.7 says that the leave-one-out residual is just the standard residual inflated by 1/(1 - H_nn): the more heavily a prediction leaned on its own output, the more it degrades when that output is withheld.
The diagonal elements of the hat and residual matrices.
It is easily verified that H is symmetric and idempotent (H^2 = H); that is, it is an orthogonal projection. Letting h_n denote the n-th column of H, idempotency and symmetry imply

    H_nn = h_n^T h_n = H_nn^2 + \sum_{m \ne n} H_mn^2,

which can only be satisfied when 0 <= H_nn <= 1 (with equality at 1 only if every off-diagonal entry of h_n vanishes). Since the diagonal elements of the residual matrix, R_nn = 1 - H_nn, therefore also lie between 0 and 1, Eq. A.7 tells us that the leave-one-out residuals are always at least as large in magnitude as the standard residuals.
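These bounds are easy to spot-check numerically. The sketch below (random data, hypothetical sizes) verifies that the diagonal entries of H and R lie in [0, 1], and that the leave-one-out residuals dominate the standard residuals entrywise:

```python
import numpy as np

rng = np.random.default_rng(3)
m, p, N = 3, 2, 20
X = rng.standard_normal((m, N))   # fat data matrix, one sample per column
Y = rng.standard_normal((p, N))

H = X.T @ np.linalg.solve(X @ X.T, X)   # hat matrix
R = np.eye(N) - H                       # residual matrix

h, r = np.diag(H), np.diag(R)           # diagonals: H_nn and R_nn = 1 - H_nn
E = Y @ R                               # standard residuals
E_loo = E / r                           # leave-one-out residuals (Eq. A.7)
```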
K-fold cross validation.
With a little effort, the analysis can be extended to more general K-fold cross validation, in which the folds can consist of more than one sample each, although they must still be disjoint.
The left-out samples from fold k can be collected into matrices X_k and Y_k, in which case the predictions at those samples, under the weights fit to the remaining data, are

    \hat{Y}_{-k} = (Y X^T - Y_k X_k^T)(X X^T - X_k X_k^T)^{-1} X_k
                 = (Y X^T - Y_k X_k^T)(X X^T)^{-1} X_k [I - H_kk]^{-1}
                 = (\hat{Y}_k - Y_k H_kk)[I - H_kk]^{-1},    (A.8)

where we have given a name to the k-th diagonal block of the hat matrix, H_kk := X_k^T (X X^T)^{-1} X_k, and applied the Woodbury identity on the second line. Now, it is easily verified that the bracketed quantity on the final line is merely R_kk, the k-th diagonal block of the residual matrix. Correspondingly, the residuals at the left-out fold are

    E_{k,loo} := Y_k - \hat{Y}_{-k} = (Y_k R_kk - \hat{Y}_k + Y_k H_kk) R_kk^{-1} = E_k R_kk^{-1},

or, collecting all K folds (with the samples ordered by fold) into a single equation,

    E_loo = E R_B^{-1},    R_B := blkdiag(R_11, ..., R_KK),    (A.9)

where we have named the block-diagonal matrix R_B constructed from the diagonal blocks of the residual matrix; the per-fold form follows since a block diagonal matrix can be inverted by inverting its blocks.
Eq. A.9 looks reassuringly like Eq. A.7, but the resemblance is somewhat misleading: R_D^{-1} merely rescales each residual by a scalar, whereas R_B^{-1} requires inverting one (fold-sized) block per fold. In practice we would not (typically) invert these blocks explicitly, but would instead solve the corresponding linear systems, E_{k,loo} R_kk = E_k, one fold at a time.
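The K-fold version can be sketched numerically as well (hypothetical sizes; contiguous, equal-sized folds assumed for simplicity). The code solves each fold's block system rather than inverting R_B, and checks the result against explicit per-fold refits:

```python
import numpy as np

rng = np.random.default_rng(2)
m, p, N, K = 3, 2, 24, 4
X = rng.standard_normal((m, N))   # fat data matrix, one sample per column
Y = rng.standard_normal((p, N))

A_inv = np.linalg.inv(X @ X.T)
H = X.T @ A_inv @ X               # hat matrix
R = np.eye(N) - H                 # residual matrix
E = Y @ R                         # standard residuals

# Contiguous folds of equal size (an assumption for this sketch).
folds = np.split(np.arange(N), K)

# One-step K-fold residuals (Eq. A.9): for each fold, solve
# E_loo_k R_kk = E_k instead of inverting the block.
E_loo = np.empty_like(E)
for idx in folds:
    R_kk = R[np.ix_(idx, idx)]
    # Right-multiplication by R_kk^{-1} == solving the transposed system.
    E_loo[:, idx] = np.linalg.solve(R_kk.T, E[:, idx].T).T

# Brute-force check: refit with each fold actually removed.
E_loo_explicit = np.empty_like(E)
for idx in folds:
    keep = np.setdiff1d(np.arange(N), idx)
    Xk, Yk = X[:, keep], Y[:, keep]
    Wk = Yk @ Xk.T @ np.linalg.inv(Xk @ Xk.T)
    E_loo_explicit[:, idx] = Y[:, idx] - Wk @ X[:, idx]
```

Each fold costs one small (fold-sized) linear solve, rather than a full refit of the regression.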