Tue, 28 Dec '04
A little quiz - what’s the output of this little code snippet?
Pattern p = Pattern.compile("aaa");
System.out.println(p.matcher("bbbaaabbb").matches());
System.out.println(p.matcher("bbbaaabbb").replaceAll("ccc"));
IMHO it should be
true
bbbcccbbb
but it is
false
bbbcccbbb
Does anybody else find this strange?
Addendum for clarity: Yes, I know this is how the Javadoc says it’s supposed to be. That doesn’t mean I like it that way. See Sonja’s comment: “pretty common source of favorite regex bugs”…
No, not quite. Java regexes have always required .* at the boundaries of a regex. A Java regex always has to match the complete string. A pretty common source of favorite regex bugs… What I am too lazy to look up (no API docs on my Windows machine) is what the replaceAll() method does. IMHO it should read … I could be wrong, I am on vacation.
No, the result is false, bbbcccbbb, as I stated. That’s confusing, at least to me. replaceAll() behaves like ‘normal’, Perl-like matching, but matches() has to match the whole string…
RTFM:
“matches(): Attempts to match the entire (!!!) input sequence against the pattern”
“replaceAll(): Replaces every subsequence (!!!) of the input sequence that matches the pattern with the given replacement string”
Yes, I know this is consistent with the javadoc. What I’m saying is that this is still not a good idea. matches() and replaceAll(…) are so similar, they should behave in the same way, e.g. like Perl.
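For anyone who wants the Perl-like behavior: Matcher also has find(), which looks for the pattern anywhere in the input, and lookingAt(), which only anchors at the start. Here is a minimal sketch comparing them on the string from the quiz. The class and helper method names are made up for illustration; only the java.util.regex calls are real API.

```java
import java.util.regex.Pattern;

public class MatchesVsFind {
    // true only if the WHOLE input matches the pattern
    static boolean wholeMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // true if the pattern matches anywhere in the input (Perl-like)
    static boolean partialMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    // true if the pattern matches at the start of the input
    static boolean prefixMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).lookingAt();
    }

    public static void main(String[] args) {
        String input = "bbbaaabbb";
        System.out.println(wholeMatch("aaa", input));       // false
        System.out.println(partialMatch("aaa", input));     // true
        System.out.println(prefixMatch("aaa", input));      // false
        System.out.println(wholeMatch(".*aaa.*", input));   // true
        System.out.println(input.replaceAll("aaa", "ccc")); // bbbcccbbb
    }
}
```

So find() is probably what most people coming from Perl expect when they reach for matches().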
If you think this would be rather cool if it worked as described here, but it doesn’t work for you, the reason may be that your test string contains line breaks.
To tell Java to treat them as normal characters, use this:
Pattern p = Pattern.compile(".*aaa.*", Pattern.DOTALL | Pattern.MULTILINE);
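To see the effect of the flags, here is a small sketch that runs the same regex against a string with line breaks. The class name, helper method and test string are made up for illustration. As far as I can tell, DOTALL is the flag that matters here, since it lets the dot match line terminators; MULTILINE only changes how ^ and $ behave.

```java
import java.util.regex.Pattern;

public class DotallDemo {
    // Whole-string match with the given flags (0 means no flags)
    static boolean matchesWithFlags(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        String input = "bbb\naaa\nbbb"; // hypothetical test string with line breaks

        // By default the dot does not match '\n', so the whole-string match fails
        System.out.println(matchesWithFlags(".*aaa.*", 0, input));                 // false

        // MULTILINE alone does not help, it only affects ^ and $
        System.out.println(matchesWithFlags(".*aaa.*", Pattern.MULTILINE, input)); // false

        // With DOTALL the dot also matches line terminators
        System.out.println(matchesWithFlags(".*aaa.*", Pattern.DOTALL, input));    // true
    }
}
```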