# HG changeset patch
# User Stefan Monnier
# Date 980377900 0
# Node ID ad24d342cb6a0f5d82809c3ed2d2885ad92f427b
# Parent 02d9b2c9558f69adf5417bd89e8da325e02485d6
(mutually_exclusive_p): Don't blindly handle `charset_not' as if it was a `charset'.

diff --git a/regex.c b/regex.c
--- a/regex.c
+++ b/regex.c
@@ -4263,7 +4263,7 @@
       {
         register re_wchar_t c
           = (re_opcode_t) *p2 == endline ? '\n'
-          : RE_STRING_CHAR(p2 + 2, pend - p2 - 2);
+          : RE_STRING_CHAR (p2 + 2, pend - p2 - 2);
 
         if ((re_opcode_t) *p1 == exactn)
           {
@@ -4308,13 +4308,11 @@
       break;
 
     case charset:
-    case charset_not:
       {
         if ((re_opcode_t) *p1 == exactn)
           /* Reuse the code above. */
           return mutually_exclusive_p (bufp, p2, p1);
 
-
       /* It is hard to list up all the character in charset
          P2 if it includes multibyte character. Give up in
          such case. */
@@ -4330,7 +4328,7 @@
              P2 is ASCII, it is enough to test only bitmap
              table of P1. */
 
-          if (*p1 == *p2)
+          if ((re_opcode_t) *p1 == charset)
             {
               int idx;
               /* We win if the charset inside the loop
@@ -4349,8 +4347,7 @@
                   return 1;
                 }
             }
-          else if ((re_opcode_t) *p1 == charset
-                   || (re_opcode_t) *p1 == charset_not)
+          else if ((re_opcode_t) *p1 == charset_not)
             {
               int idx;
               /* We win if the charset_not inside the loop lists
@@ -4370,6 +4367,22 @@
           }
       }
 
+    case charset_not:
+      switch (SWITCH_ENUM_CAST (*p1))
+        {
+        case exactn:
+        case charset:
+          /* Reuse the code above. */
+          return mutually_exclusive_p (bufp, p2, p1);
+        case charset_not:
+          /* When we have two charset_not, it's very unlikely that
+             they don't overlap. The union of the two sets of excluded
+             chars should cover all possible chars, which, as a matter of
+             fact, is virtually impossible in multibyte buffers. */
+          ;
+        }
+      break;
+
     case wordend:
     case notsyntaxspec:
       return ((re_opcode_t) *p1 == syntaxspec