Tue 28 Dec '04
java.util.regex.*
by Frank Spychalski filed under Java, Rants, Work

A little quiz: what’s the output of this code snippet?

Pattern p = Pattern.compile("aaa");
System.out.println(p.matcher("bbbaaabbb").matches());
System.out.println(p.matcher("bbbaaabbb").replaceAll("ccc"));


IMHO it should be

true
bbbcccbbb

but it is

false
bbbcccbbb

Does anybody else find this strange?

Addendum for clarity: Yes, I know this is how the Javadoc says it’s supposed to be. That doesn’t mean I like it that way. See Sonja’s comment below, which calls this a “pretty common source of favorite regex bugs”.
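For anyone who wants the Perl-like behaviour, here is a minimal sketch of the difference. Matcher.matches() only succeeds if the whole input matches, while Matcher.find() looks for the pattern anywhere in the input. The class name MatchVsFind and the helper names matchesWhole and containsMatch are made up for this illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Pattern;

class MatchVsFind {
    // True only if the pattern matches the ENTIRE input (this is what matches() does).
    static boolean matchesWhole(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // True if the pattern matches ANY subsequence of the input (Perl-like search).
    static boolean containsMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "bbbaaabbb";
        System.out.println(matchesWhole("aaa", input));   // false
        System.out.println(containsMatch("aaa", input));  // true
        // replaceAll() works on every matching subsequence, like find() does.
        System.out.println(Pattern.compile("aaa").matcher(input).replaceAll("ccc")); // bbbcccbbb
    }
}
```

So find() is probably the method to reach for when the question is "does this string contain a match?", and matches() when the whole string has to conform to the pattern.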


5 Responses to “java.util.regex.*”

  1.

    No, not quite. Java regexes have always required the .* at the boundaries of a regex. A Java regex always has to match the complete string. A pretty common source of favorite regex bugs … What I am too lazy to look up (no API docs on my Windows machine) is what the replaceAll() method does. IMHO it should read

    false
    bbbaaabbb

    I could be wrong, I am on vacation ;-)

    Sonja (December 28th, 2004 at 12:53)
  2.

    No, the result is false, bbbcccbbb, as I stated. That’s confusing, at least to me. replaceAll behaves like ‘normal’, Perl-like matching, but matches has to match the whole string…

    Frank Spychalski (December 28th, 2004 at 13:20)
  3.

    RTFM:
    “matches(): Attempts to match the entire (!!!) input sequence against the pattern”
    “replaceAll(): Replaces every subsequence (!!!) of the input sequence that matches the pattern with the given replacement string”

    Uli (December 28th, 2004 at 15:00)
  4.

    Yes, I know this is consistent with the Javadoc. What I’m saying is that this is still not a good idea. matches() and replaceAll(…) are so similar that they should behave in the same way, e.g. like Perl.

    Frank Spychalski (December 28th, 2004 at 15:35)
  5.

    If you think it would be rather cool if this worked as described here, but it doesn’t work for you, the reason may be that your test string contains line breaks etc.
    To tell Java to treat them as normal characters, use this:
    Pattern p = Pattern.compile(".*aaa.*", Pattern.DOTALL | Pattern.MULTILINE);
    (and yes, it’s a valid email address ;-) )

    schlumpf (January 7th, 2005 at 14:56)
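A small sketch of schlumpf’s point about line breaks: by default '.' does not match a line terminator, so .*aaa.* cannot span several lines unless Pattern.DOTALL is set. Pattern.MULTILINE on its own only changes how ^ and $ behave, so it does not help here. The class name DotAllDemo and the helper wholeMatch are invented for this example.

```java
import java.util.regex.Pattern;

class DotAllDemo {
    // Compiles the regex with the given flags and requires the whole input to match.
    static boolean wholeMatch(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        String input = "bbb\naaa\nbbb";
        // No flags: '.' stops at '\n', so the pattern cannot cover the whole input.
        System.out.println(wholeMatch(".*aaa.*", 0, input));                 // false
        // MULTILINE alone only affects ^ and $, not '.'.
        System.out.println(wholeMatch(".*aaa.*", Pattern.MULTILINE, input)); // false
        // DOTALL lets '.' match line terminators as well.
        System.out.println(wholeMatch(".*aaa.*", Pattern.DOTALL, input));    // true
    }
}
```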
