doc/interpreter/diagperm.txi @ 8839:fcba62cc4549

add chapter about diagonal and permutation matrices to manual

author:   Jaroslav Hajek <highegg@gmail.com>
date:     Mon, 23 Feb 2009 13:55:44 +0100
children: d6de39523f03

@c Copyright (C) 2009 Jaroslav Hajek
@c
@c This file is part of Octave.
@c
@c Octave is free software; you can redistribute it and/or modify it
@c under the terms of the GNU General Public License as published by the
@c Free Software Foundation; either version 3 of the License, or (at
@c your option) any later version.
@c
@c Octave is distributed in the hope that it will be useful, but WITHOUT
@c ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
@c FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
@c for more details.
@c
@c You should have received a copy of the GNU General Public License
@c along with Octave; see the file COPYING.  If not, see
@c <http://www.gnu.org/licenses/>.

@ifhtml
@set htmltex
@end ifhtml
@iftex
@set htmltex
@end iftex

@node Diagonal and Permutation Matrices
@chapter Diagonal and Permutation Matrices

@menu
* Basic Usage:: Creation and Manipulation of Diagonal and Permutation Matrices
* Matrix Algebra:: Linear Algebra with Diagonal and Permutation Matrices
* Function Support:: Functions That Are Aware of These Matrices
* Example Codes:: Some Examples of Usage
* Zeros Treatment:: The Differences in Treatment of Zero Elements
@end menu
@node Basic Usage
@section The Creation and Manipulation of Diagonal and Permutation Matrices

A diagonal matrix is defined as a matrix that has zero entries outside the
main diagonal; that is, @code{D(i,j) == 0} if @code{i != j}.
Most often, square diagonal matrices are considered; however, the definition
can equally be applied to nonsquare matrices, in which case we usually speak
of a rectangular diagonal matrix.  For more information, see
@url{http://en.wikipedia.org/wiki/Diagonal_matrix}.

A permutation matrix is defined as a square matrix that has a single element
equal to unity in each row and each column; all other elements are zero.
That is, there exists a permutation (vector) @code{p} such that
@code{P(i,j) == 1} if @code{j == p(i)} and @code{P(i,j) == 0} otherwise.
For more information, see
@url{http://en.wikipedia.org/wiki/Permutation_matrix}.

Octave provides special treatment of real and complex rectangular diagonal
matrices, as well as permutation matrices.  They are stored as special
objects, using efficient storage and algorithms, which makes it possible to
write matrix algebra expressions in the Octave language that are both
readable and efficient.

@menu
* Creating Diagonal Matrices::
* Creating Permutation Matrices::
* Explicit and Implicit Conversions::
@end menu

@node Creating Diagonal Matrices
@subsection Creating Diagonal Matrices

The most common and easiest way to create a diagonal matrix is to use the
built-in function @dfn{diag}.  The expression @code{diag (@var{v})}, with
@var{v} a vector, will create a square diagonal matrix with the elements of
@var{v} on the main diagonal, and with size equal to the length of @var{v}.
@code{diag (@var{v}, @var{m}, @var{n})} can be used to construct a
rectangular diagonal matrix.  The result of these expressions is a special
diagonal matrix object, rather than a general matrix object.
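
For instance, the following sketch (the values are arbitrary, and the exact
display of the result may vary between Octave versions) creates a square and
a rectangular diagonal matrix object:
@example
D = diag ([1, 2, 3]);        ## 3-by-3 square diagonal matrix object
R = diag ([1, 2, 3], 3, 5);  ## 3-by-5 rectangular diagonal matrix object
size (R)                     ## => 3   5
@end example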

Some other built-in functions can also return diagonal matrices.  Examples
include @dfn{balance} and @dfn{inv}.

@DOCSTRING(diag)

@node Creating Permutation Matrices
@subsection Creating Permutation Matrices

For creating permutation matrices, Octave does not introduce a new function,
but rather re-uses existing indexing syntax: if @var{q} is a permutation
vector of length @var{n}, the expression
@example
@var{P} = eye (@var{n}) (:, @var{q});
@end example
will create a permutation matrix, a special matrix object.
@example
eye (@var{n}) (@var{q}, :)
@end example
will also work (producing a row-permutation matrix), as will
@example
eye (@var{n}) (@var{q1}, @var{q2})
@end example
Note that the identity matrix, @code{eye (@var{n})}, is itself a diagonal
matrix, but it should work in any place where a permutation matrix is
expected.
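
For example, the following small sketch (the permutation vector is arbitrary)
creates a column-permutation matrix:
@example
q = [2, 3, 1];         ## a permutation vector
P = eye (3) (:, q);    ## a 3-by-3 column-permutation matrix object
## P has ones at positions (2,1), (3,2) and (1,3), and zeros elsewhere
@end example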

Some other built-in functions can also return permutation matrices.  Examples
include @dfn{inv} and @dfn{lu}.

@node Explicit and Implicit Conversions
@subsection Explicit and Implicit Conversions

The diagonal and permutation matrices are special objects in their own right.
A number of operations and built-in functions are defined for these matrices
to use special, more efficient code than would be used for a full matrix in
the same place.  Examples are given in the following sections.

To facilitate smooth mixing with full matrices, backward compatibility, and
compatibility with Matlab, the diagonal and permutation matrices should allow
any operation that works on full matrices, and will either treat it
specially or implicitly convert themselves to full matrices.

Instances include matrix indexing (except for extracting a single element),
indexed assignment, and applying most mapper functions, such as @dfn{exp}.

An explicit conversion to a full matrix can be requested using the built-in
function @dfn{full}.  It should also be noted that the diagonal and
permutation matrix objects cache the result of the conversion after it is
first requested (explicitly or implicitly), so that subsequent conversions
are very cheap.
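
A brief sketch of the distinction (the variable names are arbitrary):
@example
D = diag ([1, 2, 3]);  ## a diagonal matrix object
F = full (D);          ## explicit conversion: F is an ordinary full matrix
E = exp (D);           ## mapper function: implicit conversion, E is full
D(3, 1) = 7;           ## indexed assignment also converts D to a full matrix
@end example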

@node Matrix Algebra
@section Linear Algebra with Diagonal and Permutation Matrices

As already mentioned, diagonal and permutation matrices make it possible to
use efficient algorithms while preserving natural linear algebra syntax.
This section describes in detail the operations that are treated specially
when performed on these special matrix objects.

@menu
* Expressions Involving Diagonal Matrices::
* Expressions Involving Permutation Matrices::
@end menu

@node Expressions Involving Diagonal Matrices
@subsection Expressions Involving Diagonal Matrices

Assume @var{D} is a diagonal matrix.  If @var{M} is a full matrix, then
@code{@var{D}*@var{M}} will scale the rows of @var{M}.  That is, if
@code{@var{S} = @var{D}*@var{M}}, then for each pair of indices i,j it holds
that
@example
S(i,j) = D(i,i) * M(i,j).
@end example
Similarly, @code{M*D} will do a column scaling.
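
For example (a minimal sketch):
@example
D = diag ([2, 3]);
M = [1, 1; 1, 1];
D * M    ## rows of M scaled by 2 and 3:    [2, 2; 3, 3]
M * D    ## columns of M scaled by 2 and 3: [2, 3; 2, 3]
@end example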

The matrix @var{D} may also be rectangular, m-by-n where @code{m != n}.
If @code{m < n}, then the expression @code{D*M} is equivalent to
@example
D(:,1:m) * M(1:m,:),
@end example
i.e., the trailing @code{n-m} rows of @var{M} are ignored.  If @code{m > n},
then @code{D*M} is equivalent to
@example
[D(1:n,1:n) * M; zeros(m-n, columns (M))],
@end example
i.e., rows of zeros are appended to the result.
The situation for right-multiplication @code{M*D} is analogous.
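
For instance (a small sketch of the @code{m < n} case):
@example
D = diag ([2, 3], 2, 3);   ## 2-by-3 rectangular diagonal matrix
M = ones (3, 2);
D * M    ## same as D(:,1:2) * M(1:2,:), i.e., [2, 2; 3, 3];
         ## the third row of M does not contribute
@end example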

The expressions @code{D \ M} and @code{M / D} perform inverse scaling.
They are equivalent to solving a diagonal (or rectangular diagonal) linear
system in a least-squares minimum-norm sense.  In exact arithmetic, this is
equivalent to multiplying by the pseudoinverse.  The pseudoinverse of a
rectangular diagonal matrix is again a rectangular diagonal matrix with
transposed dimensions, where each nonzero diagonal element is replaced by
its reciprocal.
The matrix division algorithms do, in fact, use division rather than
multiplication by reciprocals for better numerical accuracy; otherwise, they
honor the above definition.  Note that a diagonal matrix is never truncated
due to ill-conditioning; otherwise, it would not be of much use for scaling.
This is typically consistent with linear algebra needs.  A full matrix that
only happens to be diagonal (and is thus not a special object) is of course
treated normally.
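
A brief sketch of inverse scaling:
@example
D = diag ([2, 4]);
M = [2, 2; 4, 4];
D \ M    ## divides each row of M by the corresponding diagonal element,
         ## yielding [1, 1; 1, 1]
@end example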

If @var{D1} and @var{D2} are both diagonal matrices, then the expressions
@example
@var{D1} + @var{D2}
@var{D1} - @var{D2}
@var{D1} * @var{D2}
@var{D1} / @var{D2}
@var{D1} \ @var{D2}
@end example
again produce diagonal matrices, provided that the normal dimension matching
rules are obeyed.  The relations used are the same as described above.

Also, a diagonal matrix @var{D} can be multiplied or divided by a scalar, or
raised to a scalar power if it is square, producing a diagonal matrix result
in all cases.

A diagonal matrix can also be transposed or conjugate-transposed, giving the
expected result.  Extracting a leading submatrix of a diagonal matrix, i.e.,
@code{@var{D}(1:m,1:n)}, will produce a diagonal matrix; other indexing
expressions will implicitly convert to a full matrix.
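
For example (a small sketch of the indexing rule):
@example
D = diag ([1, 2, 3, 4]);
D(1:2, 1:3)     ## a 2-by-3 leading submatrix, still a diagonal matrix object
D([1, 3], :)    ## not a leading submatrix; implicitly converted to full
@end example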

When involved in expressions with the element-by-element operators @code{.*},
@code{./}, @code{.\} or @code{.^}, an implicit conversion to a full matrix
will also take place.  This is not always strictly necessary, but it is done
to facilitate better consistency with Matlab.

@node Expressions Involving Permutation Matrices
@subsection Expressions Involving Permutation Matrices

If @var{P} is a permutation matrix and @var{M} a matrix, the expression
@code{P*M} will permute the rows of @var{M}.  Similarly, @code{M*P} will
yield a column permutation.
Matrix division @code{P\M} and @code{M/P} can be used to perform inverse
permutation.

The previously described syntax for creating permutation matrices can
actually help a user understand the connection between a permutation matrix
and a permuting vector.  Namely, the following holds, where
@code{I = eye (n)} is an identity matrix:
@example
I(p,:) * M = (I*M) (p,:) = M(p,:)
@end example
Similarly,
@example
M * I(:,p) = (M*I) (:,p) = M(:,p)
@end example

The expressions @code{I(p,:)} and @code{I(:,p)} are permutation matrices.

A permutation matrix can be transposed (or conjugate-transposed, which is the
same, because a permutation matrix is never complex), inverting the
permutation, or equivalently, turning a row-permutation matrix into a
column-permutation one.  For permutation matrices, transposition is
equivalent to inversion, so @code{P\M} is equivalent to @code{P'*M}.
Transposing (or inverting) a permutation matrix is a constant-time operation,
flipping only an internal flag, so the choice between the two equivalent
expressions for inverse permuting is entirely a matter of taste.
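
A quick sketch (with an arbitrary permutation vector):
@example
q = [2, 3, 1];
P = eye (3) (:, q);   ## column-permutation matrix
x = [10; 20; 30];
y = P * x;            ## permuted: [30; 10; 20]
P' * y                ## same as P \ y; recovers the original x
@end example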

Two permutation matrices can be multiplied or divided (if their sizes match),
performing a composition of permutations.  Also, a permutation matrix can be
indexed by a permutation vector (or two vectors), again giving a permutation
matrix.  Any other operations do not generally yield a permutation matrix and
will thus trigger the implicit conversion.

@node Function Support
@section Functions That Are Aware of These Matrices

This section lists the built-in functions that are aware of diagonal and
permutation matrices on input, or can return them as output.  Passed to
other functions, these matrices will in general trigger an implicit
conversion.  (Of course, user-defined dynamically linked functions may also
work with diagonal or permutation matrices.)

@menu
* Diagonal Matrix Functions::
* Permutation Matrix Functions::
@end menu

@node Diagonal Matrix Functions
@subsection Diagonal Matrix Functions

@dfn{inv} and @dfn{pinv} can be applied to a diagonal matrix, yielding again
a diagonal matrix.  @dfn{det} will use an efficient, straightforward
calculation when given a diagonal matrix, as will @dfn{cond}.
The following mapper functions can be applied to a diagonal matrix without
converting it to a full one:
@dfn{abs}, @dfn{real}, @dfn{imag}, @dfn{conj}, @dfn{sqrt}.
A diagonal matrix can also be returned from the @dfn{balance} and @dfn{svd}
functions.
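
A short illustration:
@example
D = diag ([1, 2, 4]);
inv (D)    ## still a diagonal matrix: diag ([1, 0.5, 0.25])
det (D)    ## the product of the diagonal entries: 8
sqrt (D)   ## element-wise sqrt; the result remains a diagonal matrix
@end example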

@node Permutation Matrix Functions
@subsection Permutation Matrix Functions

@dfn{inv} and @dfn{pinv} will invert a permutation matrix, preserving its
specialness.  @dfn{det} can be applied to a permutation matrix, efficiently
calculating the sign of the permutation (which is equal to the determinant).

A permutation matrix can also be returned from the built-in functions
@dfn{lu} and @dfn{qr}, if a pivoted factorization is requested.
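
For instance (a small sketch; with partial pivoting, the rows of this
particular matrix must be swapped):
@example
[L, U, P] = lu ([0, 2; 1, 1]);   ## pivoting swaps the two rows
det (P)                          ## sign of the permutation, here -1
@end example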

@node Example Codes
@section Some Examples of Usage

The following can be used to solve a linear system @code{A*x = b}
using the pivoted LU factorization:
@example
[L, U, P] = lu (A);    ## now L*U = P*A
x = U \ (L \ (P*b));   ## parenthesized: the operators associate left to right
@end example

This is how you normalize the columns of a matrix @var{X} to unit norm:
@example
s = norm (X, "columns");
X = diag (s) \ X;
@end example

The following expression is an efficient way to calculate the sign of a
permutation given by a permutation vector @var{p}.  It will also work in
earlier versions of Octave, but slowly.
@example
det (eye (length (p))(p, :))
@end example

Finally, here is how to solve a linear system @code{A*x = b} with Tikhonov
regularization using the SVD (a skeleton only):
@example
m = rows (A); n = columns (A);
[U, S, V] = svd (A);
## determine the regularization factor alpha
## ...
## and solve
x = U'*b;
x = (S + alpha*eye (m, n)) \ x;  ## this step is very efficient, since
                                 ## S + alpha*eye (m, n) is diagonal
x = V*x;
@end example

@node Zeros Treatment
@section The Differences in Treatment of Zero Elements

Making diagonal and permutation matrices special matrix objects in their own
right, and the consequent use of smarter algorithms for certain operations,
implies, as a side effect, small differences in the treatment of zeros.
The contents of this section also apply to sparse matrices, discussed in the
following chapter.

The IEEE floating-point standard defines the result of the expressions
@code{0*Inf} and @code{0*NaN} as @code{NaN}, as it has been generally agreed
that this is the best compromise.
Numerical software dealing with structured and sparse matrices (including
Octave), however, almost always makes a distinction between a "numerical
zero" and an "assumed zero".
A "numerical zero" is a zero value occurring in a place where any
floating-point value could occur.  It is normally stored somewhere in memory
as an explicit value.
An "assumed zero", on the contrary, is a zero matrix element implied by the
matrix structure (diagonal, triangular) or a sparsity pattern; its value is
usually not stored explicitly anywhere, but is implied by the underlying
data structure.

The primary distinction is that an assumed zero, when multiplied or divided
by any number, @emph{always} yields a zero, even when, for example,
multiplied by @code{Inf} or divided by @code{NaN}.
The reason for this behavior is that the numerical multiplication is not
actually performed anywhere by the underlying algorithm; the result is
simply assumed to be zero.  Equivalently, one can say that the part of the
computation involving assumed zeros is performed symbolically, not
numerically.

This behavior not only facilitates the most straightforward and efficient
implementation of algorithms, but also preserves certain useful invariants,
such as:
@itemize
@item scalar * diagonal matrix is a diagonal matrix
@item sparse matrix / scalar preserves the sparsity pattern
@item permutation matrix * matrix is equivalent to permuting rows
@end itemize
All of these natural mathematical truths would be invalidated by treating
assumed zeros as numerical ones.

Note that certain competing software does not strictly follow this principle
and converts assumed zeros to numerical zeros in certain cases, while not
doing so in other cases.  As of today, there is no intention to mimic such
behavior in Octave.

Examples of effects of assumed zeros vs. numerical zeros:
@example
Inf * eye (3)
@result{}
   Inf     0     0
     0   Inf     0
     0     0   Inf

Inf * speye (3)
@result{}
Compressed Column Sparse (rows = 3, cols = 3, nnz = 3 [33%])

  (1, 1) -> Inf
  (2, 2) -> Inf
  (3, 3) -> Inf

Inf * full (eye (3))
@result{}
   Inf   NaN   NaN
   NaN   Inf   NaN
   NaN   NaN   Inf

@end example

@example
diag (1:3) * [NaN; 1; 1]
@result{}
   NaN
     2
     3

sparse (1:3, 1:3, 1:3) * [NaN; 1; 1]
@result{}
   NaN
     2
     3

[1, 0, 0; 0, 2, 0; 0, 0, 3] * [NaN; 1; 1]
@result{}
   NaN
   NaN
   NaN
@end example