annotate scripts/strings/strsplit.m @ 14138:72c96de7a403 stable
maint: update copyright notices for 2012
author | John W. Eaton <jwe@octave.org> |
---|---|
date | Mon, 02 Jan 2012 14:25:41 -0500 |
parents | 9cae456085c2 |
children | 4d917a6a858b |
rev | changeset | description | author | parent |
---|---|---|---|---|
8877 | 2c8b2399247b | implement strsplit; deprecate split | Jaroslav Hajek <highegg@gmail.com> | |
8884 | | | | |
11104 | | | | |
13701 | 46e68badedb8 | strsplit.m: Expand to accept 2-D character arrays. Improve input validation. | Rik <octave@nomad.inbox5.com> | 12915 |
13775 | 73b2b3ca6524 | strsplit.m: Use S instead of P to denote string argument (Bug #"a | Rik <octave@nomad.inbox5.com> | 13701 |
13776 | eb12d5d5c7b9 | strsplit.m: Use S instead of P to denote string argument (Bug #34709). | Rik <octave@nomad.inbox5.com> | 13775 |
13929 | 9cae456085c2 | Grammarcheck of documentation before 3.6.0 release. | Rik <octave@nomad.inbox5.com> | 13776 |
14138 | 72c96de7a403 | maint: update copyright notices for 2012 | John W. Eaton <jwe@octave.org> | 13929 |

```octave
## Copyright (C) 2009-2012 Jaroslav Hajek
##
## This file is part of Octave.
##
## Octave is free software; you can redistribute it and/or modify it
## under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 3 of the License, or (at
## your option) any later version.
##
## Octave is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Octave; see the file COPYING. If not, see
## <http://www.gnu.org/licenses/>.
```
```octave
## -*- texinfo -*-
## @deftypefn {Function File} {[@var{cstr}] =} strsplit (@var{s}, @var{sep})
## @deftypefnx {Function File} {[@var{cstr}] =} strsplit (@var{s}, @var{sep}, @var{strip_empty})
## Split the string @var{s} using one or more separators @var{sep} and return
## a cell array of strings. Consecutive separators and separators at
## boundaries result in empty strings, unless @var{strip_empty} is true.
## The default value of @var{strip_empty} is false.
##
## 2-D character arrays are split at separators and at the original column
## boundaries.
##
## Example:
##
## @example
## @group
## strsplit ("a,b,c", ",")
##       @result{}
##           @{
##             [1,1] = a
##             [1,2] = b
##             [1,3] = c
##           @}
##
## strsplit (["a,b" ; "cde"], ",")
##       @result{}
##           @{
##             [1,1] = a
##             [1,2] = b
##             [1,3] = cde
##           @}
## @end group
## @end example
## @seealso{strtok}
## @end deftypefn
```
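The help text says that 2-D character arrays are also split where one row ends and the next begins. A minimal sketch of how such an array can be flattened so that each row end behaves like a separator is shown here. It repeats the three statements used in the function body of this revision, and the input values are illustrative only.

```octave
s   = ["a,b"; "cde"];               # 2x3 character array, one string per row
sep = ",";

s(:, end+1) = sep(1);               # append a separator column: ["a,b,"; "cde,"]
s = reshape (s.', 1, numel (s));    # read the array row by row: "a,b,cde,"
s(end) = [];                        # drop the trailing separator: "a,b,cde"

disp (s)
```

After this step the array is an ordinary row string, so the 1-D splitting logic can be applied to it unchanged.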
```octave
function cstr = strsplit (s, sep, strip_empty = false)

  if (nargin < 2 || nargin > 3)
    print_usage ();
  elseif (! ischar (s) || ! ischar (sep))
    error ("strsplit: S and SEP must be string values");
  elseif (! isscalar (strip_empty))
    error ("strsplit: STRIP_EMPTY must be a scalar value");
  endif

  if (isempty (s))
    cstr = cell (size (s));
  else
    if (rows (s) > 1)
      ## For 2-D arrays, add separator character at line boundaries
      ## and transform to single string
      s(:, end+1) = sep(1);
      s = reshape (s.', 1, numel (s));
      s(end) = [];
    endif

    ## Split s according to delimiter
    if (isscalar (sep))
      ## Single separator
      idx = find (s == sep);
    else
      ## Multiple separators
      idx = strchr (s, sep);
    endif

    ## Get substring lengths.
    if (isempty (idx))
      strlens = length (s);
    else
      strlens = [idx(1)-1, diff(idx)-1, numel(s)-idx(end)];
    endif
    ## Remove separators.
    s(idx) = [];
    if (strip_empty)
      ## Omit zero lengths.
      strlens = strlens(strlens != 0);
    endif

    ## Convert!
    cstr = mat2cell (s, 1, strlens);
  endif

endfunction
```
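The splitting step works on lengths rather than on substrings: it locates the separators, derives the length of each piece from the gaps between them, deletes the separators, and then cuts what is left by those lengths. The sketch below isolates that idea for a non-empty row string. `split_sketch` is a hypothetical name chosen for this illustration, and `ismember` is used here in place of the `strchr` call in the source, so this is not the function as shipped.

```octave
1;  # script marker so that the function below can be defined in a script file

## split_sketch is a hypothetical helper used only for this illustration.
## It assumes S is a non-empty row string.
function parts = split_sketch (s, sep, strip_empty)
  if (nargin < 3)
    strip_empty = false;
  endif
  idx = find (ismember (s, sep));   # positions of every separator character
  if (isempty (idx))
    lens = numel (s);               # no separator: one piece spanning s
  else
    ## Piece before the first separator, pieces between neighbouring
    ## separators, and the piece after the last separator.
    lens = [idx(1)-1, diff(idx)-1, numel(s)-idx(end)];
  endif
  s(idx) = [];                      # delete the separators themselves
  if (strip_empty)
    lens = lens(lens != 0);         # drop zero-length pieces
  endif
  parts = mat2cell (s, 1, lens);    # cut the remaining characters by length
endfunction

all_parts = split_sketch ("a,,b,", ",");        # 4 pieces, two of them empty
non_empty = split_sketch ("a,,b,", ",", true);  # {"a", "b"}
```

For `"a,,b,"` the separator positions are 2, 3 and 5, which gives the lengths 1, 0, 1 and 0. The zero entries are what become empty strings, and removing them is all that `strip_empty` has to do.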
```octave
%!assert (strsplit ("road to hell", " "), {"road", "to", "hell"})
%!assert (strsplit ("road to^hell", " ^"), {"road", "to", "hell"})
%!assert (strsplit ("road to--hell", " -", true), {"road", "to", "hell"})
%!assert (strsplit (["a,bc";",de"], ","), {"a", "bc", ones(1,0), "de "})
%!assert (strsplit (["a,bc";",de"], ",", true), {"a", "bc", "de "})
%!assert (strsplit (["a,bc";",de"], ", ", true), {"a", "bc", "de"})

%% Test input validation
%!error strsplit ()
%!error strsplit ("abc")
%!error strsplit ("abc", "b", true, 4)
%!error <S and SEP must be string values> strsplit (123, "b")
%!error <S and SEP must be string values> strsplit ("abc", 1)
%!error <STRIP_EMPTY must be a scalar value> strsplit ("abc", "def", ones(3,3))
```