Matcher.replaceAll(Function<MatchResult, String> f) [was: Re: hg: lambda/lambda/jdk: Pattern.splitAsStream.]

Paul Sandoz paul.sandoz at oracle.com
Mon Apr 22 05:54:20 PDT 2013


Hi Jürgen,

Three issues:

While these are nice to have, I am not sure they carry their weight given the time constraints we have. If you can help us provide a more complete solution, we will have a better chance of getting this into JDK8.

Thanks, Paul.

On Apr 19, 2013, at 12:59 AM, jk at blackdown.de wrote:

Hi Paul,

Paul Sandoz <paul.sandoz at oracle.com> writes:

Hi Jürgen,

That seems useful as a more general approach than Matcher.replaceAll(String), e.g.

  Matcher.replaceAll(Function<MatchResult, String> f)

Ben, thoughts?

like this?

# HG changeset patch
# User Jürgen Kreileder <jk at blackdown.de>
# Date 1366322703 -7200
# Node ID 59766f458701af5fbb23d195dd48a928505f3306
# Parent  3ec06ef568a8ded0a7ecc7624df9d3a025dad6bc
Matcher.replaceAll(Function<MatchResult, String> f)

diff --git a/src/share/classes/java/util/regex/Matcher.java b/src/share/classes/java/util/regex/Matcher.java
--- a/src/share/classes/java/util/regex/Matcher.java
+++ b/src/share/classes/java/util/regex/Matcher.java
@@ -25,6 +25,7 @@
 
 package java.util.regex;
 
+import java.util.function.Function;
 
 /**
  * An engine that performs match operations on a {@link java.lang.CharSequence
@@ -916,6 +917,54 @@
     }
 
     /**
+     * Replaces every subsequence of the input sequence that matches the
+     * pattern with the string returned by the given replacement function.
+     *
+     * <p> This method first resets this matcher.  It then scans the input
+     * sequence looking for matches of the pattern.  Characters that are not
+     * part of any match are appended directly to the result string; each match
+     * is replaced in the result by the string returned by the replacement
+     * function.  The replacement strings may contain references to captured
+     * subsequences as in the {@link #appendReplacement appendReplacement}
+     * method.
+     *
+     * <p> Note that backslashes (<tt>\</tt>) and dollar signs (<tt>$</tt>) in
+     * the string returned by the replacement function may cause the results to
+     * be different than if they were being treated as a literal strings. Dollar
+     * signs may be treated as references to captured subsequences as described
+     * above, and backslashes are used to escape literal characters in the
+     * replacement string.
+     *
+     * <p> Given the regular expression <tt>(\w)(\w*)</tt>, the input
+     * <tt>"paTTern maTcher"</tt>, and the replacement function
+     * <tt>m -> m.group(1).toUpperCase() + m.group(2).toLowerCase()</tt>, an
+     * invocation of this method on a matcher for that expression would yield
+     * the string <tt>"Pattern Matcher"</tt>.  </p>
+     *
+     * <p> Invoking this method changes this matcher's state.  If the matcher
+     * is to be used in further matching operations then it should first be
+     * reset.  </p>
+     *
+     * @param  f
+     *         The function providing replacement strings
+     * @return  The string constructed by replacing each matching subsequence
+     *          by the replacement string provide by the given function,
+     *          substituting captured subsequences as needed
+     * @since 1.8
+     */
+    public String replaceAll(Function<MatchResult, String> f) {
+        reset();
+        if (find()) {
+            StringBuffer sb = new StringBuffer();
+            do {
+                appendReplacement(sb, f.apply(this));
+            } while (find());
+            return appendTail(sb).toString();
+        }
+        return text.toString();
+    }
+
+    /**
      * Replaces the first subsequence of the input sequence that matches the
      * pattern with the given replacement string.
      *

        Juergen

On Apr 8, 2013, at 6:59 PM, jk at blackdown.de wrote:

Hi Paul,

it would be nice if Pattern/Matcher offered a terse way to loop over all matches in a string and replace them via a callback. E.g. I'm currently using something like this:

    private static final PatternAndReplacement PASS2 = new PatternAndReplacement(
        Pattern.compile(" ( "
                        + "   \\A \\p{Punct}*"          // start of title…
                        + " |"
                        + "   [:.;?!]\\ +"              // or of subsentence…
                        + " | "
                        + "   \\ ['\"“‘(\\[] \\ *"       // or of inserted subphrase…
                        + ")"
                        + "(" + SMALLWORDS + ") \\b",   // … followed by small word
                        Pattern.COMMENTS | Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CHARACTER_CLASS),
        m -> Matcher.quoteReplacement(m.group(1) + capitalize(m.group(2))));

with PatternAndReplacement being

    private static class PatternAndReplacement implements Function<String, String> {
        private final Pattern pattern;
        private final Function<MatchResult, String> function;

        PatternAndReplacement(final Pattern p, final Function<MatchResult, String> f) {
            pattern = p;
            function = f;
        }

        @Override
        public final String apply(final String s) {
            Matcher m = pattern.matcher(s);
            if (m.find()) {
                StringBuffer sb = new StringBuffer(s.length());
                do {
                    m.appendReplacement(sb, function.apply(m));
                } while (m.find());
                return m.appendTail(sb).toString();
            }
            return s;
        }
    }

Any plans for something like this?
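For readers who want to try the idea without patching the JDK, the callback loop above can be sketched as a small static helper. This is only an illustration: the class name ReplaceAllSketch and the static replaceAll helper are hypothetical and not part of the patch. It relies only on the existing Matcher.find, appendReplacement and appendTail calls, and it runs the example from the proposed Javadoc.

```java
import java.util.function.Function;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical standalone sketch; the name and shape are assumptions.
final class ReplaceAllSketch {

    // Replaces every match of `pattern` in `input` with the string
    // returned by `f` for that match.
    static String replaceAll(Pattern pattern, CharSequence input,
                             Function<MatchResult, String> f) {
        Matcher m = pattern.matcher(input);
        StringBuffer sb = new StringBuffer(input.length());
        while (m.find()) {
            // Matcher implements MatchResult, so it can be passed directly.
            // The returned string is still interpreted by appendReplacement,
            // so '$' and '\' keep their special meaning.
            m.appendReplacement(sb, f.apply(m));
        }
        // Copies the text after the last match (or the whole input if
        // nothing matched).
        return m.appendTail(sb).toString();
    }

    public static void main(String[] args) {
        Pattern word = Pattern.compile("(\\w)(\\w*)");
        String result = replaceAll(word, "paTTern maTcher",
                m -> m.group(1).toUpperCase() + m.group(2).toLowerCase());
        System.out.println(result); // prints "Pattern Matcher"
    }
}
```

Unlike the patch, the helper always builds a new string, even when nothing matches; that keeps the sketch short at the cost of one extra copy.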

        Jürgen

paul.sandoz at oracle.com writes:

Changeset: 526131346981
Author:    psandoz
Date:      2013-04-08 17:16 +0200
URL:       http://hg.openjdk.java.net/lambda/lambda/jdk/rev/526131346981

Pattern.splitAsStream.
Contributed-by: Ben Evans <benjamin.john.evans at gmail.com>

! src/share/classes/java/util/regex/Pattern.java
+ test-ng/tests/org/openjdk/tests/java/util/regex/PatternTest.java

-- 
https://blackdown.de/
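The proposed Javadoc warns that dollar signs and backslashes in the string returned by the function are still interpreted by appendReplacement, which is why the callback in the original request wraps its result in Matcher.quoteReplacement. A minimal sketch of the difference follows; the class QuoteReplacementDemo, its replaceAll helper and the price pattern are made up for illustration.

```java
import java.util.function.Function;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names and the sample pattern are assumptions.
final class QuoteReplacementDemo {

    // Same find/appendReplacement/appendTail loop as in the patch,
    // written as a static helper.
    static String replaceAll(Pattern pattern, CharSequence input,
                             Function<MatchResult, String> f) {
        Matcher m = pattern.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, f.apply(m));
        }
        return m.appendTail(sb).toString();
    }

    public static void main(String[] args) {
        Pattern price = Pattern.compile("(\\d+) USD");
        String input = "cost: 1 USD";

        // Unquoted: the function returns "$1", which appendReplacement
        // reads as a reference to group 1, so the dollar sign is lost.
        String unquoted = replaceAll(price, input, m -> "$" + m.group(1));

        // Quoted: quoteReplacement escapes the dollar sign, so the
        // returned text is inserted literally.
        String quoted = replaceAll(price, input,
                m -> Matcher.quoteReplacement("$" + m.group(1)));

        System.out.println(unquoted); // prints "cost: 1"
        System.out.println(quoted);   // prints "cost: $1"
    }
}
```

With a different amount the unquoted variant can fail outright: "$5" would refer to a group that does not exist in this pattern, and appendReplacement would throw an IndexOutOfBoundsException.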



More information about the lambda-dev mailing list