Mercurial > hg > octave-nkf

@c Copyright (C) 2009-2012 Jaroslav Hajek
@c
@c This file is part of Octave.
@c
@c Octave is free software; you can redistribute it and/or modify it
@c under the terms of the GNU General Public License as published by the
@c Free Software Foundation; either version 3 of the License, or (at
@c your option) any later version.
@c
@c Octave is distributed in the hope that it will be useful, but WITHOUT
@c ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
@c FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
@c for more details.
@c
@c You should have received a copy of the GNU General Public License
@c along with Octave; see the file COPYING.  If not, see
@c <http://www.gnu.org/licenses/>.

@node Diagonal and Permutation Matrices
@chapter Diagonal and Permutation Matrices

@menu
* Basic Usage::          Creation and Manipulation of Diagonal and Permutation Matrices
* Matrix Algebra::       Linear Algebra with Diagonal and Permutation Matrices
* Function Support::     Functions That Are Aware of These Matrices
* Example Code::         Some Examples of Usage
* Zeros Treatment::      The Differences in Treatment of Zero Elements
@end menu

@node Basic Usage
@section Creating and Manipulating Diagonal and Permutation Matrices

A diagonal matrix is defined as a matrix that has zero entries outside the main
diagonal; that is,
@tex
$D_{ij} = 0$ if $i \neq j$
@end tex
@ifnottex
@code{D(i,j) == 0} if @code{i != j}.
@end ifnottex
Most often, square diagonal matrices are considered; however, the definition can
equally be applied to non-square matrices, in which case we usually speak of a
rectangular diagonal matrix.

A permutation matrix is defined as a square matrix that has a single element
equal to unity in each row and each column; all other elements are zero.  That
is, there exists a permutation (vector)
@tex
$p$ such that $P_{ij}=1$ if $j = p_i$ and
$P_{ij}=0$ otherwise.
@end tex
@ifnottex
@code{p} such that @code{P(i,j) == 1} if @code{j == p(i)} and
@code{P(i,j) == 0} otherwise.
@end ifnottex

Octave provides special treatment of real and complex rectangular diagonal
matrices, as well as permutation matrices.  They are stored as special objects,
using efficient storage and algorithms, facilitating writing both readable and
efficient matrix algebra expressions in the Octave language.

@menu
* Creating Diagonal Matrices::
* Creating Permutation Matrices::
* Explicit and Implicit Conversions::
@end menu

@node Creating Diagonal Matrices
@subsection Creating Diagonal Matrices

The most common and easiest way to create a diagonal matrix is using the
built-in function @dfn{diag}.  The expression @code{diag (v)}, with @var{v} a
vector, will create a square diagonal matrix with elements on the main diagonal
given by the elements of @var{v}, and size equal to the length of @var{v}.
@code{diag (v, m, n)} can be used to construct a rectangular diagonal matrix.
The result of these expressions will be a special diagonal matrix object, rather
than a general matrix object.

Diagonal matrix with unit elements can be created using @dfn{eye}.
Some other built-in functions can also return diagonal matrices.  Examples
include
@dfn{balance} or @dfn{inv}.

Example:

@example
  diag (1:4)
@result{}
Diagonal Matrix

   1   0   0   0
   0   2   0   0
   0   0   3   0
   0   0   0   4

  diag (1:3,5,3)

@result{}
Diagonal Matrix

   1   0   0
   0   2   0
   0   0   3
   0   0   0
   0   0   0
@end example

@node Creating Permutation Matrices
@subsection Creating Permutation Matrices

For creating permutation matrices, Octave does not introduce a new function, but
rather overrides an existing syntax: permutation matrices can be conveniently
created by indexing an identity matrix by permutation vectors.
That is, if @var{q} is a permutation vector of length @var{n}, the expression

@example
  P = eye (n) (:, q);
@end example

@noindent
will create a permutation matrix - a special matrix object.

@example
eye (n) (q, :)
@end example

@noindent
will also work (and create a row permutation matrix), as well as

@example
eye (n) (q1, q2).
@end example

For example:

@example
@group
  eye (4) ([1,3,2,4],:)
@result{}
Permutation Matrix

   1   0   0   0
   0   0   1   0
   0   1   0   0
   0   0   0   1

  eye (4) (:,[1,3,2,4])
@result{}
Permutation Matrix

   1   0   0   0
   0   0   1   0
   0   1   0   0
   0   0   0   1
@end group
@end example

Mathematically, an identity matrix is both diagonal and permutation matrix.
In Octave, @code{eye (n)} returns a diagonal matrix, because a matrix
can only have one class.  You can convert this diagonal matrix to a permutation
matrix by indexing it by an identity permutation, as shown below.
This is a special property of the identity matrix; indexing other diagonal
matrices generally produces a full matrix.

@example
@group
  eye (3)
@result{}
Diagonal Matrix

   1   0   0
   0   1   0
   0   0   1

  eye(3)(1:3,:)
@result{}
Permutation Matrix

   1   0   0
   0   1   0
   0   0   1
@end group
@end example

Some other built-in functions can also return permutation matrices.  Examples
include
@dfn{inv} or @dfn{lu}.

@node Explicit and Implicit Conversions
@subsection Explicit and Implicit Conversions

The diagonal and permutation matrices are special objects in their own right.  A
number of operations and built-in functions are defined for these matrices to
use special, more efficient code than would be used for a full matrix in the
same place.  Examples are given in further sections.

To facilitate smooth mixing with full matrices, backward compatibility, and
compatibility with @sc{matlab}, the diagonal and permutation matrices should
allow any operation that works on full matrices, and will either treat it
specially, or implicitly convert themselves to full matrices.

Instances include matrix indexing, except for extracting a single element or
a leading submatrix, indexed assignment, or applying most mapper functions,
such as @dfn{exp}.

An explicit conversion to a full matrix can be requested using the built-in
function @dfn{full}.  It should also be noted that the diagonal and permutation
matrix objects will cache the result of the conversion after it is first
requested (explicitly or implicitly), so that subsequent conversions will
be very cheap.

@node Matrix Algebra
@section Linear Algebra with Diagonal and Permutation Matrices

As has been already said, diagonal and permutation matrices make it
possible to use efficient algorithms while preserving natural linear
algebra syntax.  This section describes in detail the operations that
are treated specially when performed on these special matrix objects.

@menu
* Expressions Involving Diagonal Matrices::
* Expressions Involving Permutation Matrices::
@end menu

@node Expressions Involving Diagonal Matrices
@subsection Expressions Involving Diagonal Matrices

Assume @var{D} is a diagonal matrix.  If @var{M} is a full matrix,
then @code{D*M} will scale the rows of @var{M}.  That means,
if @code{S = D*M}, then for each pair of indices
i,j it holds
@tex
$$S_{ij} = D_{ii} M_{ij}$$
@end tex
@ifnottex

@example
S(i,j) = D(i,i) * M(i,j).
@end example

@end ifnottex
Similarly, @code{M*D} will do a column scaling.

The matrix @var{D} may also be rectangular, m-by-n where @code{m != n}.
If @code{m < n}, then the expression @code{D*M} is equivalent to

@example
D(:,1:m) * M(1:m,:),
@end example

@noindent
i.e., trailing @code{n-m} rows of @var{M} are ignored.  If @code{m > n},
then @code{D*M} is equivalent to

@example
[D(1:n,n) * M; zeros(m-n, columns (M))],
@end example

@noindent
i.e., null rows are appended to the result.
The situation for right-multiplication @code{M*D} is analogous.

The expressions @code{D \ M} and @code{M / D} perform inverse scaling.
They are equivalent to solving a diagonal (or rectangular diagonal)
in a least-squares minimum-norm sense.  In exact arithmetic, this is
equivalent to multiplying by a pseudoinverse.  The pseudoinverse of
a rectangular diagonal matrix is again a rectangular diagonal matrix
with swapped dimensions, where each nonzero diagonal element is replaced
by its reciprocal.
The matrix division algorithms do, in fact, use division rather than
multiplication by reciprocals for better numerical accuracy; otherwise, they
honor the above definition.  Note that a diagonal matrix is never truncated due
to ill-conditioning; otherwise, it would not be much useful for scaling.  This
is typically consistent with linear algebra needs.  A full matrix that only
happens to be diagonal (an is thus not a special object) is of course treated
normally.

Multiplication and division by diagonal matrices works efficiently also when
combined with sparse matrices, i.e., @code{D*S}, where @var{D} is a diagonal
matrix and @var{S} is a sparse matrix scales the rows of the sparse matrix and
returns a sparse matrix.  The expressions @code{S*D}, @code{D\S}, @code{S/D}
work analogically.

If @var{D1} and @var{D2} are both diagonal matrices, then the expressions

@example
@group
D1 + D2
D1 - D2
D1 * D2
D1 / D2
D1 \ D2
@end group
@end example

@noindent
again produce diagonal matrices, provided that normal
dimension matching rules are obeyed.  The relations used are same as described
above.

Also, a diagonal matrix @var{D} can be multiplied or divided by a scalar, or
raised to a scalar power if it is square, producing diagonal matrix result in
all cases.

A diagonal matrix can also be transposed or conjugate-transposed, giving the
expected result.  Extracting a leading submatrix of a diagonal matrix, i.e.,
@code{D(1:m,1:n)}, will produce a diagonal matrix, other indexing expressions
will implicitly convert to full matrix.

Adding a diagonal matrix to a full matrix only operates on the diagonal
elements.  Thus,

@example
A = A + eps * eye (n)
@end example

@noindent
is an efficient method of augmenting the diagonal of a matrix.  Subtraction
works analogically.

When involved in expressions with other element-by-element operators, @code{.*},
@code{./}, @code{.\} or @code{.^}, an implicit conversion to full matrix will
take place.  This is not always strictly necessary but chosen to facilitate
better consistency with @sc{matlab}.

@node Expressions Involving Permutation Matrices
@subsection Expressions Involving Permutation Matrices

If @var{P} is a permutation matrix and @var{M} a matrix, the expression
@code{P*M} will permute the rows of @var{M}.  Similarly, @code{M*P} will
yield a column permutation.
Matrix division @code{P\M} and @code{M/P} can be used to do inverse permutation.

The previously described syntax for creating permutation matrices can actually
help an user to understand the connection between a permutation matrix and
a permuting vector.  Namely, the following holds, where @code{I = eye (n)}
is an identity matrix:

@example
  I(p,:) * M = (I*M) (p,:) = M(p,:)
@end example

Similarly,

@example
  M * I(:,p) = (M*I) (:,p) = M(:,p)
@end example

The expressions @code{I(p,:)} and @code{I(:,p)} are permutation matrices.

A permutation matrix can be transposed (or conjugate-transposed, which is the
same, because a permutation matrix is never complex), inverting the
permutation, or equivalently, turning a row-permutation matrix into a
column-permutation one.  For permutation matrices, transpose is equivalent to
inversion, thus @code{P\M} is equivalent to @code{P'*M}.  Transpose of a
permutation matrix (or inverse) is a constant-time operation, flipping only a
flag internally, and thus the choice between the two above equivalent
expressions for inverse permuting is completely up to the user's taste.

Multiplication and division by permutation matrices works efficiently also when
combined with sparse matrices, i.e., @code{P*S}, where @var{P} is a permutation
matrix and @var{S} is a sparse matrix permutes the rows of the sparse matrix and
returns a sparse matrix.  The expressions @code{S*P}, @code{P\S}, @code{S/P}
work analogically.

Two permutation matrices can be multiplied or divided (if their sizes match),
performing a composition of permutations.  Also a permutation matrix can be
indexed by a permutation vector (or two vectors), giving again a permutation
matrix.  Any other operations do not generally yield a permutation matrix and
will thus trigger the implicit conversion.

@node Function Support
@section Functions That Are Aware of These Matrices

This section lists the built-in functions that are aware of diagonal and
permutation matrices on input, or can return them as output.  Passed to other
functions, these matrices will in general trigger an implicit conversion.
(Of course, user-defined dynamically linked functions may also work with
diagonal or permutation matrices).

@menu
* Diagonal Matrix Functions::
* Permutation Matrix Functions::
@end menu

@node Diagonal Matrix Functions
@subsection Diagonal Matrix Functions

@dfn{inv} and @dfn{pinv} can be applied to a diagonal matrix, yielding again
a diagonal matrix.  @dfn{det} will use an efficient straightforward calculation
when given a diagonal matrix, as well as @dfn{cond}.
The following mapper functions can be applied to a diagonal matrix
without converting it to a full one:
@dfn{abs}, @dfn{real}, @dfn{imag}, @dfn{conj}, @dfn{sqrt}.
A diagonal matrix can also be returned from the @dfn{balance}
and @dfn{svd} functions.
The @dfn{sparse} function will convert a diagonal matrix efficiently to a
sparse matrix.

@node Permutation Matrix Functions
@subsection Permutation Matrix Functions

@dfn{inv} and @dfn{pinv} will invert a permutation matrix, preserving its
specialness.  @dfn{det} can be applied to a permutation matrix, efficiently
calculating the sign of the permutation (which is equal to the determinant).

A permutation matrix can also be returned from the built-in functions
@dfn{lu} and @dfn{qr}, if a pivoted factorization is requested.

The @dfn{sparse} function will convert a permutation matrix efficiently to a
sparse matrix.
The @dfn{find} function will also work efficiently with a permutation matrix,
making it possible to conveniently obtain the permutation indices.

@node Example Code
@section Some Examples of Usage

The following can be used to solve a linear system @code{A*x = b}
using the pivoted LU@tie{}factorization:

@example
@group
  [L, U, P] = lu (A); ## now L*U = P*A
  x = U \ L \ P*b;
@end group
@end example

@noindent
This is one way to normalize columns of a matrix @var{X} to unit norm:

@example
@group
  s = norm (X, "columns");
  X /= diag (s);
@end group
@end example

@noindent
The same can also be accomplished with broadcasting
(@pxref{Broadcasting}):

@example
@group
  s = norm (X, "columns");
  X ./= s;
@end group
@end example

@noindent
The following expression is a way to efficiently calculate the sign of a
permutation, given by a permutation vector @var{p}.  It will also work
in earlier versions of Octave, but slowly.

@example
  det (eye (length (p))(p, :))
@end example

@noindent
Finally, here's how you solve a linear system @code{A*x = b}
with Tikhonov regularization (ridge regression) using SVD (a skeleton only):

@example
@group
  m = rows (A); n = columns (A);
  [U, S, V] = svd (A);
  ## determine the regularization factor alpha
  ## alpha = @dots{}
  ## transform to orthogonal basis
  b = U'*b;
  ## Use the standard formula, replacing A with S.
  ## S is diagonal, so the following will be very fast and accurate.
  x = (S'*S + alpha^2 * eye (n)) \ (S' * b);
  ## transform to solution basis
  x = V*x;
@end group
@end example


@node Zeros Treatment
@section The Differences in Treatment of Zero Elements

Making diagonal and permutation matrices special matrix objects in their own
right and the consequent usage of smarter algorithms for certain operations
implies, as a side effect, small differences in treating zeros.
The contents of this section applies also to sparse matrices, discussed in
the following chapter.

The IEEE standard defines the result of the expressions @code{0*Inf} and
@code{0*NaN} as @code{NaN}, as it has been generally agreed that this is the
best compromise.
Numerical software dealing with structured and sparse matrices (including
Octave) however, almost always makes a distinction between a "numerical zero"
and an "assumed zero".
A "numerical zero" is a zero value occurring in a place where any floating-point
value could occur.  It is normally stored somewhere in memory as an explicit
value.
An "assumed zero", on the contrary, is a zero matrix element implied by the
matrix structure (diagonal, triangular) or a sparsity pattern; its value is
usually not stored explicitly anywhere, but is implied by the underlying
data structure.

The primary distinction is that an assumed zero, when multiplied
by any number, or divided by any nonzero number,
yields *always* a zero, even when, e.g., multiplied by @code{Inf}
or divided by @code{NaN}.
The reason for this behavior is that the numerical multiplication is not
actually performed anywhere by the underlying algorithm; the result is
just assumed to be zero.  Equivalently, one can say that the part of the
computation involving assumed zeros is performed symbolically, not numerically.

This behavior not only facilitates the most straightforward and efficient
implementation of algorithms, but also preserves certain useful invariants,
like:

@itemize
@item scalar * diagonal matrix is a diagonal matrix

@item sparse matrix / scalar preserves the sparsity pattern

@item permutation matrix * matrix is equivalent to permuting rows
@end itemize
all of these natural mathematical truths would be invalidated by treating
assumed zeros as numerical ones.

Note that @sc{matlab} does not strictly follow this principle and converts
assumed zeros to numerical zeros in certain cases, while not doing so in
other cases.  As of today, there are no intentions to mimic such behavior
in Octave.

Examples of effects of assumed zeros vs. numerical zeros:

@example
Inf * eye (3)
@result{}
   Inf     0     0
     0   Inf     0
     0     0   Inf

Inf * speye (3)
@result{}
Compressed Column Sparse (rows = 3, cols = 3, nnz = 3 [33%])

  (1, 1) -> Inf
  (2, 2) -> Inf
  (3, 3) -> Inf

Inf * full (eye (3))
@result{}
   Inf   NaN   NaN
   NaN   Inf   NaN
   NaN   NaN   Inf

@end example

@example
@group
diag (1:3) * [NaN; 1; 1]
@result{}
   NaN
     2
     3

sparse (1:3,1:3,1:3) * [NaN; 1; 1]
@result{}
   NaN
     2
     3
[1,0,0;0,2,0;0,0,3] * [NaN; 1; 1]
@result{}
   NaN
   NaN
   NaN
@end group
@end example
author	Jordi Gutiérrez Hermoso <jordigh@octave.org>
date	Wed, 01 May 2013 15:29:57 -0400
parents	c3fd61c59e9c
children	12005245b645