Monday, October 5, 2015

On the unreasonable effectiveness of Friedrichs angles: rates of convergence of powers of some weird operators

Statement of the problem

Let $d$ and $c$ be positive integers and $q = dc$. Let $G$ be a $q$-by-$q$ positive semi-definite real matrix with eigenvalues all $\le 1$, and define the $q$-by-$2q$ matrix $A = [G\hspace{1em}\mathrm{I}_q - G]$, where $\text{I}_q$ is the $q$-by-$q$ identity matrix. Now, given a scalar $\kappa > 0$ and $d$ vectors $X_1, X_2, \ldots, X_d \in \mathbb{R}^c$, with $\|X_j\| \ne \kappa \; \forall j$, consider the $q$-by-$q$ block-diagonal matrix $D$ whose $j$th block, a $c$-by-$c$ matrix, is given by \begin{equation} D_j = \begin{cases}\text{I}_c - \frac{\kappa}{\|X_j\|}\text{proj}_{\langle X_j \rangle^\perp}, &\mbox{ if } \|X_j\| > \kappa,\\ 0, &\mbox{ otherwise,}\end{cases} \end{equation} where $\text{proj}_{\langle X_j \rangle^\perp} = \mathrm{I}_c - \mathrm{proj}_{\langle X_j\rangle} = \mathrm{I}_c - \frac{1}{\|X_j\|^2}X_jX_j^T$ is the projection onto the orthogonal complement of the $1$-dimensional subspace of $\mathbb{R}^c$ spanned by the vector $X_j$. Finally define the $2q$-by-$q$ matrix $B = [D\hspace{1em}\mathrm{I}_q-D]^T$. For example, if $c=1$, then each $D_j \in \{{0,1}\}$, a bit indicating whether $|X_j| > \kappa$, and $D$ is a diagonal matrix with $0$s and $1$s accordingly. Now define the $2q$-by-$2q$ matrix $F := BA$ (Or $F := AB$. To see this equivalence, note that $(BA)^{n+1} = B(AB)^nA$, for all $n \in \mathbb{N}$. Thus to study the "convergence" of $((BA)^n)_{n \in \mathbb{N}}$, it suffices to study the convergence of $((AB)^n)_{n \in \mathbb{N}}$. In fact, if $(AB)^n \rightarrow C$, then $(BA)^n \rightarrow BCA$.).

Ultimately, I'm interested in the rate of convergence (in the sense of Definition 2.1 of this paper) of the sequence of matrix powers $(F^n)_{n\in\mathbb{N}}$. I can easily bound the spectral norm $r(F) \le \|F\| \le \|A\| \le 1$, but this turns out to be not very useful in my situation (e.g $\|A\|$ can be $1$). If I can compute the eigenvalues of $F$ (e.g in terms of the eigenvalues of $G$), then there is a fair chance I can apply the results of this paper to get rates of convergence for the aforementioned sequence.

Question

Can one relate (via a formula, for example) the eigenvalues of $F$ in terms of the eigenvalues of $G$ ? How fast does iterating $AB$ or $BA$ converge ?

Partial solution for limiting case when $G$ is the projection onto a subspace

In my initial problem, $G$ is a function of a scalar parameter, and after careful inspection, it turns out that in the limit as this parameter goes to infinity, $G$ becomes the projection operator onto a subspace of $\mathbb{R}^q$, i.e $G = \mathrm{proj}_{\text{Im }K} = K(K^TK)^{-1}K^T$ for some full column-rank $q$-by-$p$ matrix $K$ (with $p \le q$). If in addition $c=1$, then $D$ too is a projection (precisely, it is a diagonal matrix with only $0$s and $1$s, as already explained further above) and one computes \begin{eqnarray} \begin{split} F &= AB = [G\hspace{1em}\mathrm{I}_q - G][D\hspace{1em}\mathrm{I}_q - D]^T = GD + (\mathrm{I}_q - G)(\mathrm{I}_q - D)\\ &= \mathrm{proj}_U\mathrm{proj}_V + \mathrm{proj}_{U^\perp}\mathrm{proj}_{V^\perp}, \end{split} \end{eqnarray} where $U := \text{Im }K$, and $V := \text{Im }D$ (note that ${U^\perp} = \text{ker }K^T$ and ${V^\perp} = \text{ker }D^T$). It then suffices to invoke Theorem 3.10 of this [paper][2], with $\mu = 1 \in [0, 2)$ to obtain that the sequence of matrix powers $(F^n)_{n \in \mathbb{N}}$ has the optimal rate of convergence $\gamma = \cos{\theta_F}(U, V) \in [0, 1)$, the cosine of the the *Friedrichs angle* $\theta_F(U, V) \in [0,\pi/2]$ between $U$ and $V$ defined by: \begin{equation} \cos{\theta_F}(U, V) := \sup\{\langle u, v \rangle | u \in U \cap (U \cap V)^\perp, \|u\| \le 1, v \in V \cap (U \cap V)^\perp, \|v\| \le 1 \}. \end{equation}

Final words

As a pathological example, if $K$ (resp. $D$) is invertible, then $F = \mathrm{proj}_U$ (resp. $F = \mathrm{proj}_V$), and so $\cos{\theta_F}(U, V) = 0$. Consequently, $(F^n)_{n \in \mathbb{N}}$ convergences exponentially fast (in fact, in $1$ iteration!) to $F$.