doc/interpreter/diagperm.txi @ 8839:fcba62cc4549

add chapter about diagonal and permutation matrices to manual

author:   Jaroslav Hajek <highegg@gmail.com>
date:     Mon, 23 Feb 2009 13:55:44 +0100
children: d6de39523f03

@c Copyright (C) 2009 Jaroslav Hajek
@c
@c This file is part of Octave.
@c
@c Octave is free software; you can redistribute it and/or modify it
@c under the terms of the GNU General Public License as published by the
@c Free Software Foundation; either version 3 of the License, or (at
@c your option) any later version.
@c
@c Octave is distributed in the hope that it will be useful, but WITHOUT
@c ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
@c FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
@c for more details.
@c
@c You should have received a copy of the GNU General Public License
@c along with Octave; see the file COPYING.  If not, see
@c <http://www.gnu.org/licenses/>.

@ifhtml
@set htmltex
@end ifhtml
@iftex
@set htmltex
@end iftex

@node Diagonal and Permutation Matrices
@chapter Diagonal and Permutation Matrices

@menu
* Basic Usage:: Creation and Manipulation of Diagonal and Permutation Matrices
* Matrix Algebra:: Linear Algebra with Diagonal and Permutation Matrices
* Function Support:: Functions That Are Aware of These Matrices
* Example Codes:: Some Examples of Usage
* Zeros Treatment:: The Differences in Treatment of Zero Elements
@end menu
@node Basic Usage
@section The Creation and Manipulation of Diagonal and Permutation Matrices

A diagonal matrix is defined as a matrix that has zero entries outside the
main diagonal; that is, @code{D(i,j) == 0} if @code{i != j}.
Most often, square diagonal matrices are considered; however, the definition
can equally be applied to nonsquare matrices, in which case we usually speak
of a rectangular diagonal matrix.  For more information, see
@url{http://en.wikipedia.org/wiki/Diagonal_matrix}.

A permutation matrix is defined as a square matrix that has a single element
equal to unity in each row and each column; all other elements are zero.
That is, there exists a permutation (vector) @code{p} such that
@code{P(i,j) == 1} if @code{j == p(i)} and @code{P(i,j) == 0} otherwise.
For more information, see
@url{http://en.wikipedia.org/wiki/Permutation_matrix}.

Octave provides special treatment of real and complex rectangular diagonal
matrices, as well as permutation matrices.  They are stored as special
objects, using efficient storage and algorithms, which makes it possible to
write matrix algebra expressions in the Octave language that are both
readable and efficient.

@menu
* Creating Diagonal Matrices::
* Creating Permutation Matrices::
* Explicit and Implicit Conversions::
@end menu

@node Creating Diagonal Matrices
@subsection Creating Diagonal Matrices

The most common and easiest way to create a diagonal matrix is to use the
built-in function @dfn{diag}.  The expression @code{diag (@var{v})}, with
@var{v} a vector, will create a square diagonal matrix with the elements of
@var{v} on the main diagonal, and with size equal to the length of @var{v}.
@code{diag (@var{v}, @var{m}, @var{n})} can be used to construct a
rectangular diagonal matrix.  The result of these expressions is a special
diagonal matrix object, rather than a general matrix object.
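
For instance, the following sketch (the values are arbitrary, and the exact
display of the result may vary between Octave versions) creates a square and
a rectangular diagonal matrix object:
@example
D = diag ([1, 2, 3]);        ## 3-by-3 square diagonal matrix object
R = diag ([1, 2, 3], 3, 5);  ## 3-by-5 rectangular diagonal matrix object
size (R)                     ## => 3   5
@end example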

Some other built-in functions can also return diagonal matrices.  Examples
include @dfn{balance} and @dfn{inv}.

@DOCSTRING(diag)

@node Creating Permutation Matrices
@subsection Creating Permutation Matrices

For creating permutation matrices, Octave does not introduce a new function,
but rather re-uses existing indexing syntax: if @var{q} is a permutation
vector of length @var{n}, the expression
@example
@var{P} = eye (@var{n}) (:, @var{q});
@end example
will create a permutation matrix, a special matrix object.
@example
eye (@var{n}) (@var{q}, :)
@end example
will also work (producing a row-permutation matrix), as will
@example
eye (@var{n}) (@var{q1}, @var{q2})
@end example
Note that the identity matrix, @code{eye (@var{n})}, is itself a diagonal
matrix, but it should work in any place where a permutation matrix is
expected.
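
For example, the following small sketch (the permutation vector is arbitrary)
creates a column-permutation matrix:
@example
q = [2, 3, 1];         ## a permutation vector
P = eye (3) (:, q);    ## a 3-by-3 column-permutation matrix object
## P has ones at positions (2,1), (3,2) and (1,3), and zeros elsewhere
@end example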

Some other built-in functions can also return permutation matrices.  Examples
include @dfn{inv} and @dfn{lu}.

@node Explicit and Implicit Conversions
@subsection Explicit and Implicit Conversions

The diagonal and permutation matrices are special objects in their own right.
A number of operations and built-in functions are defined for these matrices
to use special, more efficient code than would be used for a full matrix in
the same place.  Examples are given in the following sections.

To facilitate smooth mixing with full matrices, backward compatibility, and
compatibility with Matlab, the diagonal and permutation matrices should allow
any operation that works on full matrices, and will either treat it
specially or implicitly convert themselves to full matrices.

Instances include matrix indexing (except for extracting a single element),
indexed assignment, and applying most mapper functions, such as @dfn{exp}.

An explicit conversion to a full matrix can be requested using the built-in
function @dfn{full}.  It should also be noted that the diagonal and
permutation matrix objects cache the result of the conversion after it is
first requested (explicitly or implicitly), so that subsequent conversions
are very cheap.
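
A brief sketch of the distinction (the variable names are arbitrary):
@example
D = diag ([1, 2, 3]);  ## a diagonal matrix object
F = full (D);          ## explicit conversion: F is an ordinary full matrix
E = exp (D);           ## mapper function: implicit conversion, E is full
D(3, 1) = 7;           ## indexed assignment also converts D to a full matrix
@end example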

@node Matrix Algebra
@section Linear Algebra with Diagonal and Permutation Matrices

As already mentioned, diagonal and permutation matrices make it possible to
use efficient algorithms while preserving natural linear algebra syntax.
This section describes in detail the operations that are treated specially
when performed on these special matrix objects.

@menu
* Expressions Involving Diagonal Matrices::
* Expressions Involving Permutation Matrices::
@end menu

@node Expressions Involving Diagonal Matrices
@subsection Expressions Involving Diagonal Matrices

Assume @var{D} is a diagonal matrix.  If @var{M} is a full matrix, then
@code{@var{D}*@var{M}} will scale the rows of @var{M}.  That is, if
@code{@var{S} = @var{D}*@var{M}}, then for each pair of indices i,j it holds
that
@example
S(i,j) = D(i,i) * M(i,j).
@end example
Similarly, @code{M*D} will do a column scaling.
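
For example (a minimal sketch):
@example
D = diag ([2, 3]);
M = [1, 1; 1, 1];
D * M    ## rows of M scaled by 2 and 3:    [2, 2; 3, 3]
M * D    ## columns of M scaled by 2 and 3: [2, 3; 2, 3]
@end example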

The matrix @var{D} may also be rectangular, m-by-n where @code{m != n}.
If @code{m < n}, then the expression @code{D*M} is equivalent to
@example
D(:,1:m) * M(1:m,:),
@end example
i.e., the trailing @code{n-m} rows of @var{M} are ignored.  If @code{m > n},
then @code{D*M} is equivalent to
@example
[D(1:n,1:n) * M; zeros(m-n, columns (M))],
@end example
i.e., rows of zeros are appended to the result.
The situation for right-multiplication @code{M*D} is analogous.
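
For instance (a small sketch of the @code{m < n} case):
@example
D = diag ([2, 3], 2, 3);   ## 2-by-3 rectangular diagonal matrix
M = ones (3, 2);
D * M    ## same as D(:,1:2) * M(1:2,:), i.e., [2, 2; 3, 3];
         ## the third row of M does not contribute
@end example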

The expressions @code{D \ M} and @code{M / D} perform inverse scaling.
They are equivalent to solving a diagonal (or rectangular diagonal) linear
system in a least-squares minimum-norm sense.  In exact arithmetic, this is
equivalent to multiplying by the pseudoinverse.  The pseudoinverse of a
rectangular diagonal matrix is again a rectangular diagonal matrix with
transposed dimensions, where each nonzero diagonal element is replaced by
its reciprocal.
The matrix division algorithms do, in fact, use division rather than
multiplication by reciprocals for better numerical accuracy; otherwise, they
honor the above definition.  Note that a diagonal matrix is never truncated
due to ill-conditioning; otherwise, it would not be of much use for scaling.
This is typically consistent with linear algebra needs.  A full matrix that
only happens to be diagonal (and is thus not a special object) is of course
treated normally.
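
A brief sketch of inverse scaling:
@example
D = diag ([2, 4]);
M = [2, 2; 4, 4];
D \ M    ## divides each row of M by the corresponding diagonal element,
         ## yielding [1, 1; 1, 1]
@end example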

If @var{D1} and @var{D2} are both diagonal matrices, then the expressions
@example
@var{D1} + @var{D2}
@var{D1} - @var{D2}
@var{D1} * @var{D2}
@var{D1} / @var{D2}
@var{D1} \ @var{D2}
@end example
again produce diagonal matrices, provided that the normal dimension matching
rules are obeyed.  The relations used are the same as described above.

Also, a diagonal matrix @var{D} can be multiplied or divided by a scalar, or
raised to a scalar power if it is square, producing a diagonal matrix result
in all cases.

A diagonal matrix can also be transposed or conjugate-transposed, giving the
expected result.  Extracting a leading submatrix of a diagonal matrix, i.e.,
@code{@var{D}(1:m,1:n)}, will produce a diagonal matrix; other indexing
expressions will implicitly convert to a full matrix.
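
For example (a small sketch of the indexing rule):
@example
D = diag ([1, 2, 3, 4]);
D(1:2, 1:3)     ## a 2-by-3 leading submatrix, still a diagonal matrix object
D([1, 3], :)    ## not a leading submatrix; implicitly converted to full
@end example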

When involved in expressions with the element-by-element operators @code{.*},
@code{./}, @code{.\} or @code{.^}, an implicit conversion to a full matrix
will also take place.  This is not always strictly necessary, but it is done
to facilitate better consistency with Matlab.

@node Expressions Involving Permutation Matrices
@subsection Expressions Involving Permutation Matrices

If @var{P} is a permutation matrix and @var{M} a matrix, the expression
@code{P*M} will permute the rows of @var{M}.  Similarly, @code{M*P} will
yield a column permutation.
Matrix division @code{P\M} and @code{M/P} can be used to perform inverse
permutation.

The previously described syntax for creating permutation matrices can
actually help a user understand the connection between a permutation matrix
and a permuting vector.  Namely, the following holds, where
@code{I = eye (n)} is an identity matrix:
@example
I(p,:) * M = (I*M) (p,:) = M(p,:)
@end example
Similarly,
@example
M * I(:,p) = (M*I) (:,p) = M(:,p)
@end example

The expressions @code{I(p,:)} and @code{I(:,p)} are permutation matrices.

A permutation matrix can be transposed (or conjugate-transposed, which is the
same, because a permutation matrix is never complex), inverting the
permutation, or equivalently, turning a row-permutation matrix into a
column-permutation one.  For permutation matrices, transposition is
equivalent to inversion, so @code{P\M} is equivalent to @code{P'*M}.
Transposing (or inverting) a permutation matrix is a constant-time operation,
flipping only an internal flag, so the choice between the two equivalent
expressions for inverse permuting is entirely a matter of taste.
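
A quick sketch (with an arbitrary permutation vector):
@example
q = [2, 3, 1];
P = eye (3) (:, q);   ## column-permutation matrix
x = [10; 20; 30];
y = P * x;            ## permuted: [30; 10; 20]
P' * y                ## same as P \ y; recovers the original x
@end example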

Two permutation matrices can be multiplied or divided (if their sizes match),
performing a composition of permutations.  Also, a permutation matrix can be
indexed by a permutation vector (or two vectors), again giving a permutation
matrix.  Any other operations do not generally yield a permutation matrix and
will thus trigger the implicit conversion.

@node Function Support
@section Functions That Are Aware of These Matrices

This section lists the built-in functions that are aware of diagonal and
permutation matrices on input, or can return them as output.  Passed to
other functions, these matrices will in general trigger an implicit
conversion.  (Of course, user-defined dynamically linked functions may also
work with diagonal or permutation matrices.)

@menu
* Diagonal Matrix Functions::
* Permutation Matrix Functions::
@end menu

@node Diagonal Matrix Functions
@subsection Diagonal Matrix Functions

@dfn{inv} and @dfn{pinv} can be applied to a diagonal matrix, yielding again
a diagonal matrix.  @dfn{det} will use an efficient, straightforward
calculation when given a diagonal matrix, as will @dfn{cond}.
The following mapper functions can be applied to a diagonal matrix without
converting it to a full one:
@dfn{abs}, @dfn{real}, @dfn{imag}, @dfn{conj}, @dfn{sqrt}.
A diagonal matrix can also be returned from the @dfn{balance} and @dfn{svd}
functions.
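
A short illustration:
@example
D = diag ([1, 2, 4]);
inv (D)    ## still a diagonal matrix: diag ([1, 0.5, 0.25])
det (D)    ## the product of the diagonal entries: 8
sqrt (D)   ## element-wise sqrt; the result remains a diagonal matrix
@end example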

@node Permutation Matrix Functions
@subsection Permutation Matrix Functions

@dfn{inv} and @dfn{pinv} will invert a permutation matrix, preserving its
specialness.  @dfn{det} can be applied to a permutation matrix, efficiently
calculating the sign of the permutation (which is equal to the determinant).

A permutation matrix can also be returned from the built-in functions
@dfn{lu} and @dfn{qr}, if a pivoted factorization is requested.
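
For instance (a small sketch; with partial pivoting, the rows of this
particular matrix must be swapped):
@example
[L, U, P] = lu ([0, 2; 1, 1]);   ## pivoting swaps the two rows
det (P)                          ## sign of the permutation, here -1
@end example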

@node Example Codes
@section Some Examples of Usage

The following can be used to solve a linear system @code{A*x = b}
using the pivoted LU factorization:
@example
[L, U, P] = lu (A);    ## now L*U = P*A
x = U \ (L \ (P*b));   ## parenthesized: the operators associate left to right
@end example

This is how you normalize the columns of a matrix @var{X} to unit norm:
@example
s = norm (X, "columns");
X = diag (s) \ X;
@end example

The following expression is an efficient way to calculate the sign of a
permutation given by a permutation vector @var{p}.  It will also work in
earlier versions of Octave, but slowly.
@example
det (eye (length (p))(p, :))
@end example

Finally, here is how to solve a linear system @code{A*x = b} with Tikhonov
regularization using the SVD (a skeleton only):
@example
m = rows (A); n = columns (A);
[U, S, V] = svd (A);
## determine the regularization factor alpha
## ...
## and solve
x = U'*b;
x = (S + alpha*eye (m, n)) \ x;  ## this step is very efficient, since
                                 ## S + alpha*eye (m, n) is diagonal
x = V*x;
@end example

@node Zeros Treatment
@section The Differences in Treatment of Zero Elements

Making diagonal and permutation matrices special matrix objects in their own
right, and the consequent use of smarter algorithms for certain operations,
implies, as a side effect, small differences in the treatment of zeros.
The contents of this section also apply to sparse matrices, discussed in the
following chapter.

The IEEE floating-point standard defines the result of the expressions
@code{0*Inf} and @code{0*NaN} as @code{NaN}, as it has been generally agreed
that this is the best compromise.
Numerical software dealing with structured and sparse matrices (including
Octave), however, almost always makes a distinction between a "numerical
zero" and an "assumed zero".
A "numerical zero" is a zero value occurring in a place where any
floating-point value could occur.  It is normally stored somewhere in memory
as an explicit value.
An "assumed zero", on the contrary, is a zero matrix element implied by the
matrix structure (diagonal, triangular) or a sparsity pattern; its value is
usually not stored explicitly anywhere, but is implied by the underlying
data structure.

The primary distinction is that an assumed zero, when multiplied or divided
by any number, @emph{always} yields a zero, even when, for example,
multiplied by @code{Inf} or divided by @code{NaN}.
The reason for this behavior is that the numerical multiplication is not
actually performed anywhere by the underlying algorithm; the result is
simply assumed to be zero.  Equivalently, one can say that the part of the
computation involving assumed zeros is performed symbolically, not
numerically.

This behavior not only facilitates the most straightforward and efficient
implementation of algorithms, but also preserves certain useful invariants,
such as:
@itemize
@item scalar * diagonal matrix is a diagonal matrix
@item sparse matrix / scalar preserves the sparsity pattern
@item permutation matrix * matrix is equivalent to permuting rows
@end itemize
All of these natural mathematical truths would be invalidated by treating
assumed zeros as numerical ones.

Note that certain competing software does not strictly follow this principle
and converts assumed zeros to numerical zeros in certain cases, while not
doing so in other cases.  As of today, there is no intention to mimic such
behavior in Octave.

Examples of effects of assumed zeros vs. numerical zeros:
@example
Inf * eye (3)
@result{}
   Inf     0     0
     0   Inf     0
     0     0   Inf

Inf * speye (3)
@result{}
Compressed Column Sparse (rows = 3, cols = 3, nnz = 3 [33%])

  (1, 1) -> Inf
  (2, 2) -> Inf
  (3, 3) -> Inf

Inf * full (eye (3))
@result{}
   Inf   NaN   NaN
   NaN   Inf   NaN
   NaN   NaN   Inf

@end example

@example
diag (1:3) * [NaN; 1; 1]
@result{}
   NaN
     2
     3

sparse (1:3, 1:3, 1:3) * [NaN; 1; 1]
@result{}
   NaN
     2
     3

[1, 0, 0; 0, 2, 0; 0, 0, 3] * [NaN; 1; 1]
@result{}
   NaN
   NaN
   NaN
@end example