
4. Singular Value Decomposition

library(bage)

Derivation of standardised principal components

Data

Let $\pmb{Y}$ be an $A \times L$ matrix of values, where $a = 1, \dots, A$ indexes age group and $l = 1, \dots, L$ indexes some combination of classifying variables, such as country crossed with time. The values are real numbers, including negative numbers, such as log-transformed rates or logit-transformed probabilities.

Singular value decomposition

We perform a singular value decomposition on $\pmb{Y}$, and retain only the first $C < A$ components, to obtain
$$
\pmb{Y} \approx \pmb{U} \pmb{D} \pmb{V}^\top.
$$

$\pmb{U}$ is an $A \times C$ matrix whose columns are left singular vectors. $\pmb{D}$ is a $C \times C$ diagonal matrix holding the singular values. $\pmb{V}$ is an $L \times C$ matrix whose columns are right singular vectors.
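The truncated decomposition can be sketched in base R. This is a minimal illustration using `svd()` on simulated values, not necessarily how bage computes it internally; the dimensions `A`, `L`, `C` and the simulated matrix `Y` are assumptions made up for the example.

```r
set.seed(1)

# Hypothetical dimensions: A age groups, L classifying combinations,
# C retained components
A <- 10
L <- 50
C <- 3

# Simulated stand-in for log-transformed rates (not real data):
# an age profile (recycled down each column) plus noise
Y <- matrix(seq(-6, -2, length.out = A) + rnorm(A * L, sd = 0.3),
            nrow = A, ncol = L)

# Singular value decomposition, keeping the first C components
sv <- svd(Y, nu = C, nv = C)
U <- sv$u                               # A x C, left singular vectors
D <- diag(sv$d[seq_len(C)], nrow = C)   # C x C, singular values
V <- sv$v                               # L x C, right singular vectors

# Rank-C approximation of Y
Y_approx <- U %*% D %*% t(V)
max(abs(Y - Y_approx))
```

The squared approximation error equals the sum of the squared singular values that were discarded, so a larger `C` gives a closer fit.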

Standardising

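Let $\pmb{m}_V$ be a vector, the $c$th element of which is the mean of the $c$th right singular vector, $m_c = \sum_{l=1}^L v_{lc} / L$. Similarly, let $\pmb{s}_V$ be a vector, the $c$th element of which is the standard deviation of the $c$th right singular vector, $\sqrt{\sum_{l=1}^L (v_{lc} - m_c)^2 / (L-1)}$. Then define
$$
\pmb{M}_V = \pmb{1} \pmb{m}_V^\top, \qquad \pmb{S}_V = \text{diag}(\pmb{s}_V),
$$
where $\pmb{1}$ is an $L$-vector of ones. Let $\tilde{\pmb{V}}$ be a standardised version of $\pmb{V}$,
$$
\tilde{\pmb{V}} = (\pmb{V} - \pmb{M}_V) \pmb{S}_V^{-1}.
$$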
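The standardisation step can be sketched in base R as follows. The simulated matrix `Y` and the dimensions are assumptions for illustration only, and the object names (`m_V`, `s_V`, `M_V`, `S_V`, `V_tilde`) are chosen here to mirror the notation above.

```r
set.seed(1)

# Hypothetical dimensions and simulated data (not real rates)
A <- 10
L <- 50
C <- 3
Y <- matrix(seq(-6, -2, length.out = A) + rnorm(A * L, sd = 0.3),
            nrow = A, ncol = L)

# Right singular vectors, L x C
V <- svd(Y, nu = C, nv = C)$v

m_V <- colMeans(V)         # C-vector of column means
s_V <- apply(V, 2, sd)     # C-vector of column sds (L - 1 denominator)

M_V <- matrix(1, nrow = L, ncol = 1) %*% t(m_V)   # L x C, equals 1 m_V'
S_V <- diag(s_V, nrow = C)                        # C x C diagonal

# Standardised right singular vectors
V_tilde <- (V - M_V) %*% solve(S_V)

round(colMeans(V_tilde), 10)   # column means after standardising
apply(V_tilde, 2, sd)          # column standard deviations after standardising
```

For this construction, `scale(V)` gives the same values as `V_tilde`, since it also centres each column and divides by the standard deviation with an $L - 1$ denominator.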

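We can now express $\pmb{Y}$ as
$$
\pmb{Y} \approx \pmb{U} \pmb{D} (\tilde{\pmb{V}} \pmb{S}_V + \pmb{M}_V)^\top = \pmb{U} \pmb{D} \pmb{S}_V \tilde{\pmb{V}}^\top + \pmb{U} \pmb{D} \pmb{M}_V^\top = \pmb{A} \tilde{\pmb{V}}^\top + \pmb{B}.
$$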

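Furthermore, we can express matrix $\pmb{B}$ as
$$
\pmb{B} = \pmb{U} \pmb{D} \pmb{M}_V^\top = \pmb{U} \pmb{D} \pmb{m}_V \pmb{1}^\top = \pmb{b} \pmb{1}^\top,
$$
so that every column of $\pmb{B}$ equals the $A$-vector $\pmb{b} = \pmb{U} \pmb{D} \pmb{m}_V$.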
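The re-expression can be checked numerically. The sketch below, again on simulated data, builds the matrix $\pmb{A}$, the vector $\pmb{b}$, and the matrix $\pmb{B}$, and confirms that they reproduce the rank-$C$ approximation. The name `A_mat` is a hypothetical choice to avoid clashing with the dimension `A`.

```r
set.seed(1)

# Hypothetical dimensions and simulated data (not real rates)
A <- 10
L <- 50
C <- 3
Y <- matrix(seq(-6, -2, length.out = A) + rnorm(A * L, sd = 0.3),
            nrow = A, ncol = L)

sv <- svd(Y, nu = C, nv = C)
U <- sv$u
D <- diag(sv$d[seq_len(C)], nrow = C)
V <- sv$v

# Standardise the columns of V
m_V <- colMeans(V)
s_V <- apply(V, 2, sd)
V_tilde <- sweep(sweep(V, 2, m_V, "-"), 2, s_V, "/")

# Matrix A = U D S_V and vector b = U D m_V
A_mat <- U %*% D %*% diag(s_V, nrow = C)   # A x C
b <- drop(U %*% D %*% m_V)                 # A-vector
B <- outer(b, rep(1, L))                   # A x L, every column equals b

# Both routes give the same rank-C approximation of Y
Y_rank_C <- U %*% D %*% t(V)
Y_rebuilt <- A_mat %*% t(V_tilde) + B
all.equal(Y_rebuilt, Y_rank_C)
```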

Result

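Consider a randomly selected row $\tilde{\pmb{v}}_l$ from $\tilde{\pmb{V}}$. By construction, every column of $\tilde{\pmb{V}}$ has mean 0 and standard deviation 1, so $\text{E}[\tilde{\pmb{v}}_l] = \pmb{0}$ and each element of $\tilde{\pmb{v}}_l$ has unit variance. Because the columns of $\pmb{V}$ are orthonormal, the sample covariance between columns $c$ and $c'$ of $\pmb{V}$, with $c \neq c'$, is $-L m_c m_{c'} / (L-1)$, which is zero when either column mean is zero (TODO - spell this out a bit more). Under that condition, $\text{Var}[\tilde{\pmb{v}}_l] = \pmb{I}$. This implies that if we set $\pmb{y} = \pmb{A} \pmb{z} + \pmb{b}$, where $\pmb{z} \sim \text{N}(\pmb{0}, \pmb{I})$, then $\pmb{y}$ will look like a randomly-chosen column from $\pmb{Y}$.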
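As a rough illustration of the result, the sketch below draws standard normal vectors and transforms them into synthetic age profiles. The data are simulated, and `A_mat`, `n_draw`, and `Y_sim` are hypothetical names introduced for the example. It also computes the empirical moments of the standardised right singular vectors so that they can be inspected directly.

```r
set.seed(1)

# Hypothetical dimensions and simulated data (not real rates)
A <- 10
L <- 50
C <- 3
Y <- matrix(seq(-6, -2, length.out = A) + rnorm(A * L, sd = 0.3),
            nrow = A, ncol = L)

sv <- svd(Y, nu = C, nv = C)
U <- sv$u
D <- diag(sv$d[seq_len(C)], nrow = C)
V <- sv$v

m_V <- colMeans(V)
s_V <- apply(V, 2, sd)
V_tilde <- scale(V)                        # standardised right singular vectors

A_mat <- U %*% D %*% diag(s_V, nrow = C)   # A x C
b <- drop(U %*% D %*% m_V)                 # A-vector

# Empirical moments of the rows of V_tilde
mean_tilde <- colMeans(V_tilde)   # zero by construction
cov_tilde <- cov(V_tilde)         # diagonal is 1 by construction; off-diagonals
                                  # are not forced to zero by the standardising

# Draw n_draw vectors z ~ N(0, I) and form y = A z + b for each
n_draw <- 5
Z <- matrix(rnorm(C * n_draw), nrow = C)
Y_sim <- A_mat %*% Z + b          # b is recycled down each column

dim(Y_sim)                        # one synthetic age profile per column
```

Comparing the columns of `Y_sim` with columns of `Y`, for instance by plotting both against age group, gives an informal check of how closely the draws resemble the original data.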

TODO - illustrate with examples