Processing math: 100%

Monday, October 5, 2015

On the unreasonable effectiveness of Friedrichs angles: rates of convergence of powers of some weird operators

Statement of the problem

Let d and c be positive integers and q = dc. Let G be a q-by-q positive semi-definite real matrix with eigenvalues all \le 1, and define the q-by-2q matrix A = [G\hspace{1em}\mathrm{I}_q - G], where \text{I}_q is the q-by-q identity matrix. Now, given a scalar \kappa > 0 and d vectors X_1, X_2, \ldots, X_d \in \mathbb{R}^c, with \|X_j\| \ne \kappa \; \forall j, consider the q-by-q block-diagonal matrix D whose jth block, a c-by-c matrix, is given by \begin{equation} D_j = \begin{cases}\text{I}_c - \frac{\kappa}{\|X_j\|}\text{proj}_{\langle X_j \rangle^\perp}, &\mbox{ if } \|X_j\| > \kappa,\\ 0, &\mbox{ otherwise,}\end{cases} \end{equation} where \text{proj}_{\langle X_j \rangle^\perp} = \mathrm{I}_c - \mathrm{proj}_{\langle X_j\rangle} = \mathrm{I}_c - \frac{1}{\|X_j\|^2}X_jX_j^T is the projection onto the orthogonal complement of the 1-dimensional subspace of \mathbb{R}^c spanned by the vector X_j. Finally define the 2q-by-q matrix B = [D\hspace{1em}\mathrm{I}_q-D]^T. For example, if c=1, then each D_j \in \{{0,1}\}, a bit indicating whether |X_j| > \kappa, and D is a diagonal matrix with 0s and 1s accordingly. Now define the 2q-by-2q matrix F := BA (Or F := AB. To see this equivalence, note that (BA)^{n+1} = B(AB)^nA, for all n \in \mathbb{N}. Thus to study the "convergence" of ((BA)^n)_{n \in \mathbb{N}}, it suffices to study the convergence of ((AB)^n)_{n \in \mathbb{N}}. In fact, if (AB)^n \rightarrow C, then (BA)^n \rightarrow BCA.).

Ultimately, I'm interested in the rate of convergence (in the sense of Definition 2.1 of this paper) of the sequence of matrix powers (F^n)_{n\in\mathbb{N}}. I can easily bound the spectral norm r(F) \le \|F\| \le \|A\| \le 1, but this turns out to be not very useful in my situation (e.g \|A\| can be 1). If I can compute the eigenvalues of F (e.g in terms of the eigenvalues of G), then there is a fair chance I can apply the results of this paper to get rates of convergence for the aforementioned sequence.

Question

Can one relate (via a formula, for example) the eigenvalues of F in terms of the eigenvalues of G ? How fast does iterating AB or BA converge ?

Partial solution for limiting case when G is the projection onto a subspace

In my initial problem, G is a function of a scalar parameter, and after careful inspection, it turns out that in the limit as this parameter goes to infinity, G becomes the projection operator onto a subspace of \mathbb{R}^q, i.e G = \mathrm{proj}_{\text{Im }K} = K(K^TK)^{-1}K^T for some full column-rank q-by-p matrix K (with p \le q). If in addition c=1, then D too is a projection (precisely, it is a diagonal matrix with only 0s and 1s, as already explained further above) and one computes \begin{eqnarray} \begin{split} F &= AB = [G\hspace{1em}\mathrm{I}_q - G][D\hspace{1em}\mathrm{I}_q - D]^T = GD + (\mathrm{I}_q - G)(\mathrm{I}_q - D)\\ &= \mathrm{proj}_U\mathrm{proj}_V + \mathrm{proj}_{U^\perp}\mathrm{proj}_{V^\perp}, \end{split} \end{eqnarray} where U := \text{Im }K, and V := \text{Im }D (note that {U^\perp} = \text{ker }K^T and {V^\perp} = \text{ker }D^T). It then suffices to invoke Theorem 3.10 of this [paper][2], with \mu = 1 \in [0, 2) to obtain that the sequence of matrix powers (F^n)_{n \in \mathbb{N}} has the optimal rate of convergence \gamma = \cos{\theta_F}(U, V) \in [0, 1), the cosine of the the *Friedrichs angle* \theta_F(U, V) \in [0,\pi/2] between U and V defined by: \begin{equation} \cos{\theta_F}(U, V) := \sup\{\langle u, v \rangle | u \in U \cap (U \cap V)^\perp, \|u\| \le 1, v \in V \cap (U \cap V)^\perp, \|v\| \le 1 \}. \end{equation}

Final words

As a pathological example, if K (resp. D) is invertible, then F = \mathrm{proj}_U (resp. F = \mathrm{proj}_V), and so \cos{\theta_F}(U, V) = 0. Consequently, (F^n)_{n \in \mathbb{N}} convergences exponentially fast (in fact, in 1 iteration!) to F.

No comments:

Post a Comment