view doc/interpreter/diagperm.txi @ 16601:189241a7c3a9
maint: periodic merge of stable to default
author | Jordi Gutiérrez Hermoso <jordigh@octave.org> |
---|---|
date | Wed, 01 May 2013 15:29:57 -0400 |
parents | c3fd61c59e9c |
children | 12005245b645 |
@c Copyright (C) 2009-2012 Jaroslav Hajek
@c
@c This file is part of Octave.
@c
@c Octave is free software; you can redistribute it and/or modify it
@c under the terms of the GNU General Public License as published by the
@c Free Software Foundation; either version 3 of the License, or (at
@c your option) any later version.
@c
@c Octave is distributed in the hope that it will be useful, but WITHOUT
@c ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
@c FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
@c for more details.
@c
@c You should have received a copy of the GNU General Public License
@c along with Octave; see the file COPYING.  If not, see
@c <http://www.gnu.org/licenses/>.

@node Diagonal and Permutation Matrices
@chapter Diagonal and Permutation Matrices

@menu
* Basic Usage::          Creation and Manipulation of Diagonal and Permutation Matrices
* Matrix Algebra::       Linear Algebra with Diagonal and Permutation Matrices
* Function Support::     Functions That Are Aware of These Matrices
* Example Code::         Some Examples of Usage
* Zeros Treatment::      The Differences in Treatment of Zero Elements
@end menu

@node Basic Usage
@section Creating and Manipulating Diagonal and Permutation Matrices

A diagonal matrix is defined as a matrix that has zero entries outside the main
diagonal; that is,
@tex
$D_{ij} = 0$ if $i \neq j$
@end tex
@ifnottex
@code{D(i,j) == 0} if @code{i != j}.
@end ifnottex
Most often, square diagonal matrices are considered; however, the definition
can equally be applied to non-square matrices, in which case we usually speak
of a rectangular diagonal matrix.

A permutation matrix is defined as a square matrix that has a single element
equal to unity in each row and each column; all other elements are zero.  That
is, there exists a permutation (vector)
@tex
$p$ such that $P_{ij}=1$ if $j = p_i$ and
$P_{ij}=0$ otherwise.
@end tex
@ifnottex
@code{p} such that @code{P(i,j) == 1} if @code{j == p(i)} and
@code{P(i,j) == 0} otherwise.
@end ifnottex

Octave provides special treatment of real and complex rectangular diagonal
matrices, as well as permutation matrices.  They are stored as special objects,
using efficient storage and algorithms, which makes it easier to write matrix
algebra expressions in the Octave language that are both readable and
efficient.

@menu
* Creating Diagonal Matrices::
* Creating Permutation Matrices::
* Explicit and Implicit Conversions::
@end menu

@node Creating Diagonal Matrices
@subsection Creating Diagonal Matrices

The most common and easiest way to create a diagonal matrix is using the
built-in function @dfn{diag}.  The expression @code{diag (v)}, with @var{v} a
vector, will create a square diagonal matrix with elements on the main diagonal
given by the elements of @var{v}, and size equal to the length of @var{v}.
@code{diag (v, m, n)} can be used to construct a rectangular diagonal matrix.
The result of these expressions will be a special diagonal matrix object,
rather than a general matrix object.

A diagonal matrix with unit elements can be created using @dfn{eye}.
Some other built-in functions can also return diagonal matrices.  Examples
include
@dfn{balance} or @dfn{inv}.

Example:

@example
  diag (1:4)
@result{}
Diagonal Matrix

   1   0   0   0
   0   2   0   0
   0   0   3   0
   0   0   0   4

  diag (1:3,5,3)

@result{}
Diagonal Matrix

   1   0   0
   0   2   0
   0   0   3
   0   0   0
   0   0   0
@end example

@node Creating Permutation Matrices
@subsection Creating Permutation Matrices

For creating permutation matrices, Octave does not introduce a new function,
but rather overrides an existing syntax: permutation matrices can be
conveniently created by indexing an identity matrix by permutation vectors.
That is, if @var{q} is a permutation vector of length @var{n}, the expression

@example
  P = eye (n) (:, q);
@end example

@noindent
will create a permutation matrix, which is a special matrix object.
@example
eye (n) (q, :)
@end example

@noindent
will also work (and create a row permutation matrix), as well as

@example
eye (n) (q1, q2).
@end example

For example:

@example
@group
  eye (4) ([1,3,2,4],:)
@result{}
Permutation Matrix

   1   0   0   0
   0   0   1   0
   0   1   0   0
   0   0   0   1

  eye (4) (:,[1,3,2,4])
@result{}
Permutation Matrix

   1   0   0   0
   0   0   1   0
   0   1   0   0
   0   0   0   1
@end group
@end example

Mathematically, an identity matrix is both a diagonal and a permutation matrix.
In Octave, @code{eye (n)} returns a diagonal matrix, because a matrix
can only have one class.  You can convert this diagonal matrix to a
permutation matrix by indexing it by an identity permutation, as shown
below.  This is a special property of the identity matrix; indexing
other diagonal matrices generally produces a full matrix.

@example
@group
  eye (3)
@result{}
Diagonal Matrix

   1   0   0
   0   1   0
   0   0   1

  eye(3)(1:3,:)
@result{}
Permutation Matrix

   1   0   0
   0   1   0
   0   0   1
@end group
@end example

Some other built-in functions can also return permutation matrices.  Examples
include
@dfn{inv} or @dfn{lu}.

@node Explicit and Implicit Conversions
@subsection Explicit and Implicit Conversions

The diagonal and permutation matrices are special objects in their own right.  A
number of operations and built-in functions are defined for these matrices to
use special, more efficient code than would be used for a full matrix in the
same place.  Examples are given in further sections.

To facilitate smooth mixing with full matrices, backward compatibility, and
compatibility with @sc{matlab}, the diagonal and permutation matrices should
allow any operation that works on full matrices, and will either treat it
specially, or implicitly convert themselves to full matrices.

Instances include matrix indexing, except for extracting a single element or
a leading submatrix, indexed assignment, or applying most mapper functions,
such as @dfn{exp}.

An explicit conversion to a full matrix can be requested using the built-in
function @dfn{full}.
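
The conversions described above can be observed with a small sketch.  It
assumes that the built-in function @code{typeinfo} is an acceptable way to
inspect the internal type of a value; the variable @var{D} is purely
illustrative, and the exact type strings reported may differ between Octave
versions.

@example
@group
D = diag ([1, 2, 3]);
typeinfo (D)             # special diagonal matrix object
typeinfo (D(1:2,1:2))    # leading submatrix: still a diagonal matrix
typeinfo (D(2:3,2:3))    # other indexing: implicit conversion to full
typeinfo (exp (D))       # most mapper functions: conversion to full
typeinfo (full (D))      # explicit conversion to full
@end group
@end example
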
It should also be noted that the diagonal and permutation matrix objects will
cache the result of the conversion after it is first requested (explicitly or
implicitly), so that subsequent conversions will be very cheap.

@node Matrix Algebra
@section Linear Algebra with Diagonal and Permutation Matrices

As already mentioned, diagonal and permutation matrices make it
possible to use efficient algorithms while preserving natural linear
algebra syntax.  This section describes in detail the operations that
are treated specially when performed on these special matrix objects.

@menu
* Expressions Involving Diagonal Matrices::
* Expressions Involving Permutation Matrices::
@end menu

@node Expressions Involving Diagonal Matrices
@subsection Expressions Involving Diagonal Matrices

Assume @var{D} is a diagonal matrix.  If @var{M} is a full matrix, then
@code{D*M} will scale the rows of @var{M}.  That means,
if @code{S = D*M}, then for each pair of indices
i,j the following holds:
@tex
$$S_{ij} = D_{ii} M_{ij}$$
@end tex
@ifnottex

@example
S(i,j) = D(i,i) * M(i,j).
@end example

@end ifnottex
Similarly, @code{M*D} will do a column scaling.

The matrix @var{D} may also be rectangular, m-by-n where @code{m != n}.
If @code{m < n}, then the expression @code{D*M} is equivalent to

@example
D(:,1:m) * M(1:m,:),
@end example

@noindent
i.e., trailing @code{n-m} rows of @var{M} are ignored.  If @code{m > n},
then @code{D*M} is equivalent to

@example
[D(1:n,:) * M; zeros(m-n, columns (M))],
@end example

@noindent
i.e., null rows are appended to the result.
The situation for right-multiplication @code{M*D} is analogous.

The expressions @code{D \ M} and @code{M / D} perform inverse scaling.
They are equivalent to solving a diagonal (or rectangular diagonal) system
in a least-squares minimum-norm sense.  In exact arithmetic, this is
equivalent to multiplying by a pseudoinverse.
The pseudoinverse of a rectangular diagonal matrix is again
a rectangular diagonal matrix with swapped dimensions, where each
nonzero diagonal element is replaced by its reciprocal.
The matrix division algorithms do, in fact, use division rather than
multiplication by reciprocals for better numerical accuracy; otherwise, they
honor the above definition.  Note that a diagonal matrix is never truncated due
to ill-conditioning; otherwise, it would not be of much use for scaling.  This
is typically consistent with linear algebra needs.  A full matrix that only
happens to be diagonal (and is thus not a special object) is of course treated
normally.

Multiplication and division by diagonal matrices also work efficiently when
combined with sparse matrices.  For example, @code{D*S}, where @var{D} is a
diagonal matrix and @var{S} is a sparse matrix, scales the rows of the sparse
matrix and returns a sparse matrix.  The expressions @code{S*D}, @code{D\S},
@code{S/D} work analogously.

If @var{D1} and @var{D2} are both diagonal matrices, then the expressions

@example
@group
D1 + D2
D1 - D2
D1 * D2
D1 / D2
D1 \ D2
@end group
@end example

@noindent
again produce diagonal matrices, provided that normal
dimension matching rules are obeyed.  The relations used are the same as
described above.

Also, a diagonal matrix @var{D} can be multiplied or divided by a scalar, or
raised to a scalar power if it is square, producing a diagonal matrix result in
all cases.

A diagonal matrix can also be transposed or conjugate-transposed, giving the
expected result.  Extracting a leading submatrix of a diagonal matrix, i.e.,
@code{D(1:m,1:n)}, will produce a diagonal matrix; other indexing expressions
will implicitly convert to a full matrix.

Adding a diagonal matrix to a full matrix only operates on the diagonal
elements.  Thus,

@example
A = A + eps * eye (n)
@end example

@noindent
is an efficient method of augmenting the diagonal of a matrix.  Subtraction
works analogously.
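
The scaling and division rules of this subsection can be checked on small
inputs.  The following is only a sketch: the matrices @var{M}, @var{R},
@var{B} and the vector @var{c} are hypothetical values chosen for
illustration, and the comparison with broadcasting assumes an Octave version
that supports it.

@example
@group
M = magic (4);
D = diag ([1, 2, 3, 4]);
isequal (D * M, [1; 2; 3; 4] .* M)    # row scaling
isequal (M * D, M .* [1, 2, 3, 4])    # column scaling

## rectangular case with m > n: null rows are appended
R = diag ([1, 2], 3, 2);
B = [1, 2; 3, 4];
R * B
@result{}
   1   2
   6   8
   0   0

## inverse scaling agrees with multiplying by the pseudoinverse
c = [2; 4; 6];
R \ c
@result{}
   2
   2
norm (R \ c - pinv (R) * c)           # expected to be zero or negligible
@end group
@end example
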
When involved in expressions with other element-by-element operators,
@code{.*}, @code{./}, @code{.\} or @code{.^}, an implicit conversion to a full
matrix will take place.  This is not always strictly necessary, but was chosen
to facilitate better consistency with @sc{matlab}.

@node Expressions Involving Permutation Matrices
@subsection Expressions Involving Permutation Matrices

If @var{P} is a permutation matrix and @var{M} a matrix, the expression
@code{P*M} will permute the rows of @var{M}.  Similarly, @code{M*P} will
yield a column permutation.
Matrix division @code{P\M} and @code{M/P} can be used to do inverse
permutation.

The previously described syntax for creating permutation matrices can actually
help a user to understand the connection between a permutation matrix and
a permuting vector.  Namely, the following holds, where @code{I = eye (n)}
is an identity matrix:

@example
  I(p,:) * M = (I*M) (p,:) = M(p,:)
@end example

Similarly,

@example
  M * I(:,p) = (M*I) (:,p) = M(:,p)
@end example

The expressions @code{I(p,:)} and @code{I(:,p)} are permutation matrices.

A permutation matrix can be transposed (or conjugate-transposed, which is the
same, because a permutation matrix is never complex), inverting the
permutation, or equivalently, turning a row-permutation matrix into a
column-permutation one.  For permutation matrices, transpose is equivalent to
inversion, thus @code{P\M} is equivalent to @code{P'*M}.  Transpose of a
permutation matrix (or inverse) is a constant-time operation, flipping only a
flag internally, and thus the choice between the two above equivalent
expressions for inverse permuting is completely up to the user's taste.

Multiplication and division by permutation matrices also work efficiently when
combined with sparse matrices.  For example, @code{P*S}, where @var{P} is a
permutation matrix and @var{S} is a sparse matrix, permutes the rows of the
sparse matrix and returns a sparse matrix.  The expressions @code{S*P},
@code{P\S}, @code{S/P} work analogously.
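
The identities above relating permutation matrices and permuting vectors can
be verified directly.  In this sketch the permutation vector @var{p} and the
test matrix @var{M} are hypothetical values chosen for illustration; each
comparison is expected to return true.

@example
@group
p = [3, 1, 2];
M = magic (3);
P = eye (3)(p,:);                 # row permutation matrix
Q = eye (3)(:,p);                 # column permutation matrix
isequal (P * M, M(p,:))           # row permutation
isequal (M * Q, M(:,p))           # column permutation
isequal (P \ M, P' * M)           # inverse permutation, two spellings
isequal (P \ (P * M), M)          # permuting, then undoing it
@end group
@end example
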
Two permutation matrices can be multiplied or divided (if their sizes match),
performing a composition of permutations.  Also a permutation matrix can be
indexed by a permutation vector (or two vectors), giving again a permutation
matrix.  Any other operations do not generally yield a permutation matrix and
will thus trigger the implicit conversion.

@node Function Support
@section Functions That Are Aware of These Matrices

This section lists the built-in functions that are aware of diagonal and
permutation matrices on input, or can return them as output.  Passed to other
functions, these matrices will in general trigger an implicit conversion.
(Of course, user-defined dynamically linked functions may also work with
diagonal or permutation matrices).

@menu
* Diagonal Matrix Functions::
* Permutation Matrix Functions::
@end menu

@node Diagonal Matrix Functions
@subsection Diagonal Matrix Functions

@dfn{inv} and @dfn{pinv} can be applied to a diagonal matrix, yielding again
a diagonal matrix.  @dfn{det} will use an efficient straightforward calculation
when given a diagonal matrix, as well as @dfn{cond}.
The following mapper functions can be applied to a diagonal matrix
without converting it to a full one:
@dfn{abs}, @dfn{real}, @dfn{imag}, @dfn{conj}, @dfn{sqrt}.
A diagonal matrix can also be returned from the @dfn{balance}
and @dfn{svd} functions.
The @dfn{sparse} function will convert a diagonal matrix efficiently to a
sparse matrix.

@node Permutation Matrix Functions
@subsection Permutation Matrix Functions

@dfn{inv} and @dfn{pinv} will invert a permutation matrix, preserving its
specialness.  @dfn{det} can be applied to a permutation matrix, efficiently
calculating the sign of the permutation (which is equal to the determinant).

A permutation matrix can also be returned from the built-in functions
@dfn{lu} and @dfn{qr}, if a pivoted factorization is requested.

The @dfn{sparse} function will convert a permutation matrix efficiently to a
sparse matrix.
The @dfn{find} function will also work efficiently with a permutation matrix,
making it possible to conveniently obtain the permutation indices.

@node Example Code
@section Some Examples of Usage

The following can be used to solve a linear system @code{A*x = b}
using the pivoted LU@tie{}factorization:

@example
@group
  [L, U, P] = lu (A); ## now L*U = P*A
  ## parentheses are required: \ and * associate left to right
  x = U \ (L \ (P*b));
@end group
@end example

@noindent
This is one way to normalize columns of a matrix @var{X} to unit norm:

@example
@group
  s = norm (X, "columns");
  X /= diag (s);
@end group
@end example

@noindent
The same can also be accomplished with broadcasting (@pxref{Broadcasting}):

@example
@group
  s = norm (X, "columns");
  X ./= s;
@end group
@end example

@noindent
The following expression is a way to efficiently calculate the sign of a
permutation, given by a permutation vector @var{p}.  It will also work
in earlier versions of Octave, but slowly.

@example
  det (eye (length (p))(p, :))
@end example

@noindent
Finally, here's how you solve a linear system @code{A*x = b}
with Tikhonov regularization (ridge regression) using SVD (a skeleton only):

@example
@group
  m = rows (A); n = columns (A);
  [U, S, V] = svd (A);
  ## determine the regularization factor alpha
  ## alpha = @dots{}
  ## transform to orthogonal basis
  b = U'*b;
  ## Use the standard formula, replacing A with S.
  ## S is diagonal, so the following will be very fast and accurate.
  x = (S'*S + alpha^2 * eye (n)) \ (S' * b);
  ## transform to solution basis
  x = V*x;
@end group
@end example

@node Zeros Treatment
@section The Differences in Treatment of Zero Elements

Making diagonal and permutation matrices special matrix objects in their own
right and the consequent usage of smarter algorithms for certain operations
implies, as a side effect, small differences in treating zeros.
The contents of this section also apply to sparse matrices, discussed in
the following chapter.
The IEEE floating-point standard defines the result of the expressions
@code{0*Inf} and @code{0*NaN} as @code{NaN}, as it has been generally agreed
that this is the best compromise.
Numerical software dealing with structured and sparse matrices (including
Octave), however, almost always makes a distinction between a "numerical zero"
and an "assumed zero".
A "numerical zero" is a zero value occurring in a place where any
floating-point value could occur.  It is normally stored somewhere in memory
as an explicit value.
An "assumed zero", on the contrary, is a zero matrix element implied by the
matrix structure (diagonal, triangular) or a sparsity pattern; its value is
usually not stored explicitly anywhere, but is implied by the underlying
data structure.

The primary distinction is that an assumed zero, when multiplied
by any number, or divided by any nonzero number,
@emph{always} yields a zero, even when, e.g., multiplied by @code{Inf}
or divided by @code{NaN}.
The reason for this behavior is that the numerical multiplication is not
actually performed anywhere by the underlying algorithm; the result is
just assumed to be zero.  Equivalently, one can say that the part of the
computation involving assumed zeros is performed symbolically, not numerically.

This behavior not only facilitates the most straightforward and efficient
implementation of algorithms, but also preserves certain useful invariants,
like:

@itemize
@item scalar * diagonal matrix is a diagonal matrix

@item sparse matrix / scalar preserves the sparsity pattern

@item permutation matrix * matrix is equivalent to permuting rows
@end itemize

All of these natural mathematical truths would be invalidated by treating
assumed zeros as numerical ones.

Note that @sc{matlab} does not strictly follow this principle and converts
assumed zeros to numerical zeros in certain cases, while not doing so in
other cases.  As of today, there are no intentions to mimic such behavior
in Octave.

Examples of effects of assumed zeros vs.
numerical zeros:

@example
Inf * eye (3)
@result{}
   Inf     0     0
     0   Inf     0
     0     0   Inf

Inf * speye (3)
@result{}
Compressed Column Sparse (rows = 3, cols = 3, nnz = 3 [33%])

  (1, 1) -> Inf
  (2, 2) -> Inf
  (3, 3) -> Inf

Inf * full (eye (3))
@result{}
   Inf   NaN   NaN
   NaN   Inf   NaN
   NaN   NaN   Inf

@end example

@example
@group
diag (1:3) * [NaN; 1; 1]
@result{}
   NaN
     2
     3

sparse (1:3,1:3,1:3) * [NaN; 1; 1]
@result{}
   NaN
     2
     3

[1,0,0;0,2,0;0,0,3] * [NaN; 1; 1]
@result{}
   NaN
   NaN
   NaN
@end group
@end example
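
The invariants for diagonal and permutation matrices listed in this section
can be illustrated in the same way.  The following is a sketch only: the
names @var{P} and @var{v} are hypothetical, and @code{typeinfo} is assumed
here as a convenient way to confirm that the result keeps its special type.

@example
@group
## permutation matrix * matrix only permutes rows; NaN does not spread
P = eye (3)([2, 1, 3],:);
v = [NaN; 1; 2];
P * v
@result{}
     1
   NaN
     2

## scalar * diagonal matrix is still a diagonal matrix, even for Inf
typeinfo (Inf * eye (3))
@result{} diagonal matrix
@end group
@end example
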