comparison doc/interpreter/diagperm.txi @ 8839:fcba62cc4549

add chapter about diagonal and permutation matrices to manual
author Jaroslav Hajek <highegg@gmail.com>
date Mon, 23 Feb 2009 13:55:44 +0100
parents
children d6de39523f03
comparison
equal deleted inserted replaced
8838:dea5bd01e6d7 8839:fcba62cc4549
1 @c Copyright (C) 2009 Jaroslav Hajek
2 @c
3 @c This file is part of Octave.
4 @c
5 @c Octave is free software; you can redistribute it and/or modify it
6 @c under the terms of the GNU General Public License as published by the
7 @c Free Software Foundation; either version 3 of the License, or (at
8 @c your option) any later version.
9 @c
10 @c Octave is distributed in the hope that it will be useful, but WITHOUT
11 @c ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
12 @c FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
13 @c for more details.
14 @c
15 @c You should have received a copy of the GNU General Public License
16 @c along with Octave; see the file COPYING. If not, see
17 @c <http://www.gnu.org/licenses/>.
18
19 @ifhtml
20 @set htmltex
21 @end ifhtml
22 @iftex
23 @set htmltex
24 @end iftex
25
26 @node Diagonal and Permutation Matrices
27 @chapter Diagonal and Permutation Matrices
28
29 @menu
30 * Basic Usage:: Creation and Manipulation of Diagonal and Permutation Matrices
31 * Matrix Algebra:: Linear Algebra with Diagonal and Permutation Matrices
32 * Function Support:: Functions That Are Aware of These Matrices
33 * Example Codes:: Some Examples of Usage
34 * Zeros Treatment:: The Differences in Treatment of Zero Elements
35 @end menu
36
37 @node Basic Usage
38 @section The Creation and Manipulation of Diagonal and Permutation Matrices
39
40 A diagonal matrix is defined as a matrix that has zero entries outside the main diagonal;
41 that is, @code{D(i,j) == 0} if @code{i != j}.
42 Most often, square diagonal matrices are considered; however, the definition can equally
43 be applied to nonsquare matrices, in which case we usually speak of a rectangular diagonal
44 matrix. For more information, see @url{http://en.wikipedia.org/wiki/Diagonal_matrix}.
45
46 A permutation matrix is defined as a square matrix that has a single element equal to unity
47 in each row and each column; all other elements are zero. That is, there exists a
48 permutation (vector) @code{p} such that @code{P(i,j) == 1} if @code{j == p(i)} and
49 @code{P(i,j) == 0} otherwise. For more information, see
50 @url{http://en.wikipedia.org/wiki/Permutation_matrix}.
51
52 Octave provides special treatment of real and complex rectangular diagonal matrices,
53 as well as permutation matrices. They are stored as special objects, using efficient
54 storage and algorithms, facilitating writing both readable and efficient matrix algebra
55 expressions in the Octave language.
56
57 @menu
58 * Creating Diagonal Matrices::
59 * Creating Permutation Matrices::
60 * Explicit and Implicit Conversions::
61 @end menu
62
63 @node Creating Diagonal Matrices
64 @subsection Creating Diagonal Matrices
65
66 The most common and easiest way to create a diagonal matrix is using the built-in
67 function @dfn{diag}. The expression @code{diag (@var{v})}, with @var{v} a vector,
68 will create a square diagonal matrix with elements on the main diagonal given
69 by the elements of @var{v}, and size equal to the length of @var{v}.
70 @code{diag (@var{v}, m, n)} can be used to construct a rectangular diagonal matrix.
71 The result of these expressions will be a special diagonal matrix object, rather
72 than a general matrix object.
73
74 Some other built-in functions can also return diagonal matrices. Examples include
75 @dfn{balance} or @dfn{inv}.
76
77 @DOCSTRING(diag)
78
79 @node Creating Permutation Matrices
80 @subsection Creating Permutation Matrices
81
82 For creating permutation matrices, Octave does not introduce a new function, but
83 rather overrides an existing syntax. That is, if @var{q} is a permutation vector
84 of length @var{n}, the expression
85 @example
86 @var{P} = eye (@var{n}) (:, @var{q});
87 @end example
88 will create a permutation matrix - a special matrix object.
89 @example
90 eye (@var{n}) (@var{q}, :)
91 @end example
92 will also work (a row permutation matrix), as well as
93 @example
94 eye (@var{n}) (@var{q1}, @var{q2}).
95 @end example
96 Note that the identity, @code{eye (@var{n})}, is a diagonal matrix by definition,
97 but should work in any place where a permutation matrix is requested.
98
99 Some other built-in functions can also return permutation matrices. Examples include
100 @dfn{inv} or @dfn{lu}.
101
102 @node Explicit and Implicit Conversions
103 @subsection Explicit and Implicit Conversions
104
105 The diagonal and permutation matrices are special objects in their own right. A number
106 of operations and built-in functions, are defined for these matrices to use special,
107 more efficient code than would be used for a full matrix in the same place. Examples
108 are given in further sections.
109
110 To facilitate smooth mixing with full matrices, backward compatibility, and
111 compatibility with Matlab, the diagonal and permutation matrices should allow
112 any operation that works on full matrices, and will either treat is specially,
113 or implicitly convert themselves to full matrices.
114
115 Instances include matrix indexing (except for extracting a single element),
116 indexed assignment, or applying most mapper functions, such as @dfn{exp}.
117
118 An explicit conversion to a full matrix can be requested using the built-in
119 function @dfn{full}. It should also be noted that the diagonal and permutation
120 matrix objects will cache the result of the conversion after it is first
121 requested (explicitly or implicitly), so that subsequent conversions will
122 be very cheap.
123
124 @node Matrix Algebra
125 @section Linear Algebra with Diagonal and Permutation Matrices
126
127 As has been already said, diagonal and permutation matrices make it
128 possible to use efficient algorithms while preserving natural linear
129 algebra syntax. This section describes in detail the operations that
130 are treated specially when performed on these special matrix objects.
131
132 @menu
133 * Expressions Involving Diagonal Matrices::
134 * Expressions Involving Permutation Matrices::
135 @end menu
136
137 @node Expressions Involving Diagonal Matrices
138 @subsection Expressions Involving Diagonal Matrices
139
140 Assume @var{D} is a diagonal matrix. If @var{M} is a full matrix,
141 then @code{@var{D}*@var{M}} will scale the rows of @var{M}. That means,
142 if @code{@var{S} = @var{D}*@var{M}}, then for each pair of indices
143 i,j it holds
144 @example
145 S(i,j) = D(i,i) * M(i,j).
146 @end example
147 Similarly, @code{M*D} will do a column scaling.
148
149 The matrix @var{D} may also be rectangular, m-by-n where @code{m != n}.
150 If @code{m < n}, then the expression @code{D*M} is equivalent to
151 @example
152 D(:,1:m) * M(1:m,:),
153 @end example
154 i.e. trailing @code{n-m} rows of @var{M} are ignored. If @code{m > n},
155 then @code{D*M} is equivalent to
156 @example
157 [D(1:n,n) * M; zeros(m-n, columns (M))],
158 @end example
159 i.e. null rows are appended to the result.
160 The situation for right-multiplication @code{M*D} is analogous.
161
162 The expressions @code{D \ M} and @code{M / D} perform inverse scaling.
163 They are equivalent to solving a diagonal (or rectangular diagonal)
164 in a least-squares minimum-norm sense. In exact arithmetics, this is
165 equivalent to multiplying by a pseudoinverse. The pseudoinverse of
166 a rectangular diagonal matrix is again a rectangular diagonal matrix
167 of the same dimensions, where each nonzero diagonal element is replaced
168 by its reciprocal.
169 The matrix division algorithms do, in fact, use division rather than
170 multiplication by reciprocals for better numerical accuracy; otherwise, they
171 honor the above definition. Note that a diagonal matrix is never truncated due
172 to ill-conditioning; otherwise, it would not be much useful for scaling. This
173 is typically consistent with linear algebra needs. A full matrix that only
174 happens to be diagonal (an is thus not a special object) is of course treated
175 normally.
176
177 If @var{D1} and @var{D2} are both diagonal matrices, then the expressions
178 @example
179 @var{D1} + @var{D2}
180 @var{D1} - @var{D2}
181 @var{D1} * @var{D2}
182 @var{D1} / @var{D2}
183 @var{D1} \ @var{D2}
184 @end example
185 again produce diagonal matrices, provided that normal
186 dimension matching rules are obeyed. The relations used are same as described above.
187
188 Also, a diagonal matrix @var{D} can be multiplied or divided by a scalar, or raised
189 to a scalar power if it is square, producing diagonal matrix result in all cases.
190
191 A diagonal matrix can also be transposed or conjugate-transposed, giving the expected
192 result. Extracting a leading submatrix of a diagonal matrix, i.e. @code{@var{D}(1:m,1:n)},
193 will produce a diagonal matrix, other indexing expressions will implicitly convert to
194 full matrix.
195
196 When involved in expressions with the element-by-element operators @code{.*},
197 @code{./}, @code{.\} or @code{.^}, an implicit conversion to full matrix will
198 also take place. This is not always strictly necessary but chosen to facilitate
199 better consistency with Matlab.
200
201 @node Expressions Involving Permutation Matrices
202 @subsection Expressions Involving Permutation Matrices
203
204 If @var{P} is a permutation matrix and @var{M} a matrix, the expression
205 @code{P*M} will permute the rows of @var{M}. Similarly, @code{M*P} will
206 yield a column permutation.
207 Matrix division @code{P\M} and @code{M/P} can be used to do inverse permutation.
208
209 The previously described syntax for creating permutation matrices can actually
210 help an user to understand the connection between a permutation matrix and
211 a permuting vector. Namely, the following holds, where @code{I = eye (n)}
212 is an identity matrix:
213 @example
214 I(p,:) * M = (I*M) (p,:) = M(p,:)
215 @end example
216 Similarly,
217 @example
218 M * I(:,p) = (M*I) (:,p) = M(:,p)
219 @end example
220
221 The expressions @code{I(p,:)} and @code{I(:,p)} are permutation matrices.
222
223 A permutation matrix can be transposed (or conjugate-transposed, which is the
224 same, because a permutation matrix is never complex), inverting the
225 permutation, or equivalently, turning a row-permutation matrix into a
226 column-permutation one. For permutation matrices, transpose is equivalent to
227 inversion, thus @code{P\M} is equivalent to @code{P'*M}. Transpose of a
228 permutation matrix (or inverse) is a constant-time operation, flipping only a
229 flag internally, and thus the choice between the two above equivalent
230 expressions for inverse permuting is completely up to the user's taste.
231
232 Two permutation matrices can be multiplied or divided (if their sizes match), performing
233 a composition of permutations. Also a permutation matrix can be indexed by a permutation
234 vector (or two vectors), giving again a permutation matrix.
235 Any other operations do not generally yield a permutation matrix and will thus
236 trigger the implicit conversion.
237
238 @node Function Support
239 @section Functions That Are Aware of These Matrices
240
241 This section lists the built-in functions that are aware of diagonal and
242 permutation matrices on input, or can return them as output. Passed to other
243 functions, these matrices will in general trigger an implicit conversion.
244 (Of course, user-defined dynamically linked functions may also work with
245 diagonal or permutation matrices).
246
247 @menu
248 * Diagonal Matrix Functions::
249 * Permutation Matrix Functions::
250 @end menu
251
252 @node Diagonal Matrix Functions
253 @subsection Diagonal Matrix Functions
254
255 @dfn{inv} and @dfn{pinv} can be applied to a diagonal matrix, yielding again
256 a diagonal matrix. @dfn{det} will use an efficient straightforward calculation
257 when given a diagonal matrix, as well as @dfn{cond}.
258 The following mapper functions can be applied to a diagonal matrix
259 without converting it to a full one:
260 @dfn{abs}, @dfn{real}, @dfn{imag}, @dfn{conj}, @dfn{sqrt}.
261 A diagonal matrix can also be returned from the @dfn{balance}
262 and @dfn{svd} functions.
263
264 @node Permutation Matrix Functions
265 @subsection Permutation Matrix Functions
266
267 @dfn{inv} and @dfn{pinv} will invert a permutation matrix, preserving its
268 specialness. @dfn{det} can be applied to a permutation matrix, efficiently
269 calculating the sign of the permutation (which is equal to the determinant).
270
271 A permutation matrix can also be returned from the built-in functions
272 @dfn{lu} and @dfn{qr}, if a pivoted factorization is requested.
273
274 @node Example Codes
275 @section Some Examples of Usage
276
277 The following can be used to solve a linear system @code{A*x = b}
278 using the pivoted LU factorization:
279 @example
280 [L, U, P] = lu (A); ## now L*U = P*A
281 x = U \ L \ P*b;
282 @end example
283
284 This is how you normalize columns of a matrix @var{X} to unit norm:
285 @example
286 s = norm (X, "columns");
287 X = diag (s) \ X;
288 @end example
289
290 The following expression is a way to efficiently calculate the sign of a
291 permutation, given by a permutation vector @var{p}. It will also work
292 in earlier versions of Octave, but slowly.
293 @example
294 det (eye (length (p))(p, :))
295 @end example
296
297 Finally, here's how you solve a linear system @code{A*x = b}
298 with Tikhonov regularization using SVD (a skeleton only):
299 @example
300 m = rows (A); n = columns (A);
301 [U, S, V] = svd (A);
302 ## determine the regularization factor alpha
303 ## ...
304 ## and solve
305 x = U'*b;
306 x = (S + alpha*eye (m, n)) \ x; ## this will be very efficient
307 x = V*x;
308 @end example
309
310 @node Zeros Treatment
311 @section The Differences in Treatment of Zero Elements
312
313 Making diagonal and permutation matrices special matrix objects in their own
314 right and the consequent usage of smarter algorithms for certain operations
315 implies, as a side effect, small differences in treating zeros.
316 The contents of this section applies also to sparse matrices, discussed in
317 the following chapter.
318
319 The IEEE standard defines the result of the expressions @code{0*Inf} and
320 @code{0*NaN} as @code{NaN}, as it has been generally agreed that this is the
321 best compromise.
322 Numerical software dealing with structured and sparse matrices (including
323 Octave) however, almost always makes a distinction between a "numerical zero"
324 and an "assumed zero".
325 A "numerical zero" is a zero value occuring in a place where any floating-point
326 value could occur. It is normally stored somewhere in memory as an explicit
327 value.
328 An "assumed zero", on the contrary, is a zero matrix element implied by the
329 matrix structure (diagonal, triangular) or a sparsity pattern; its value is
330 usually not stored explicitly anywhere, but is implied by the underlying
331 data structure.
332
333 The primary distinction is that an assumed zero, when multiplied or divided
334 by any number, yields *always* a zero, even when, e.g., multiplied by @code{Inf}
335 or divided by @code{NaN}.
336 The reason for this behavior is that the numerical multiplication is not
337 actually performed anywhere by the underlying algorithm; the result is
338 just assumed to be zero. Equivalently, one can say that the part of the
339 computation involving assumed zeros is performed symbolically, not numerically.
340
341 This behavior not only facilitates the most straightforward and efficient
342 implementation of algorithms, but also preserves certain useful invariants,
343 like:
344 @itemize
345 @item scalar * diagonal matrix is a diagonal matrix
346 @item sparse matrix / scalar preserves the sparsity pattern
347 @item permutation matrix * matrix is equivalent to permuting rows
348 @end itemize
349 all of these natural mathematical truths would be invalidated by treating
350 assumed zeros as numerical ones.
351
352 Note that certain competing software does not strictly follow this principle
353 and converts assumed zeros to numerical zeros in certain cases, while not doing
354 so in other cases. As of today, there are no intentions to mimick such behavior
355 in Octave.
356
357 Examples of effects of assumed zeros vs. numerical zeros:
358 @example
359 Inf * eye (3)
360 @result{}
361 Inf 0 0
362 0 Inf 0
363 0 0 Inf
364
365 Inf * speye (3)
366 @result{}
367 Compressed Column Sparse (rows = 3, cols = 3, nnz = 3 [33%])
368
369 (1, 1) -> Inf
370 (2, 2) -> Inf
371 (3, 3) -> Inf
372
373 Inf * full (eye (3))
374 @result{}
375 Inf NaN NaN
376 NaN Inf NaN
377 NaN NaN Inf
378
379 @end example
380
381 @example
382 diag(1:3) * [NaN; 1; 1]
383 @result{}
384 NaN
385 2
386 3
387
388 sparse(1:3,1:3,1:3) * [NaN; 1; 1]
389 @result{}
390 NaN
391 2
392 3
393 [1,0,0;0,2,0;0,0,3] * [NaN; 1; 1]
394 @result{}
395 NaN
396 NaN
397 NaN
398 @end example
399