Java Map replaceAll with multiple String matches

Time: 2013-12-03 01:49:54

Tags: java regex string replace

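I have the following program, in which I want to replace every occurrence in a string of a word that exists as a key in a map with its corresponding value.

I have implemented 4 methods. Each of them does roughly the same thing in a different way. The output of the first 3 is incorrect, because each subsequent replacement overwrites the result of the previous one. The fourth works, but only because I am replacing single characters throughout the string. In any case it is very inefficient, because I only check one substring of the whole string at a time.

Is there a way to safely replace all occurrences without overwriting previous replacements?

I noticed that Apache has a StringUtils.replaceEach() method, but I would prefer to use a map.

Output: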

Apple BApplenApplenApple CApplentApplelope DApplete Apple BApplenApplenApple CApplentApplelope DApplete
Apple BApplenApplenApple CApplentApplelope DApplete Apple BApplenApplenApple CApplentApplelope DApplete
Apple BApplenApplenApple CApplentApplelope DApplete Apple BApplenApplenApple CApplentApplelope DApplete
Apple Banana Cantalope Date Apple Banana Cantalope Date

ReplaceMap.java

import java.util.Collection;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceMap {
    private static Map<String, String> replacements;

    static {
        replacements = new HashMap<String, String>();
        replacements.put("a", "Apple");
        replacements.put("b", "Banana");
        replacements.put("c", "Cantalope");
        replacements.put("d", "Date");
    }

    public ReplaceMap() {
        String phrase = "a b c d a b c d";

        System.out.println(mapReplaceAll1(phrase, replacements));
        System.out.println(mapReplaceAll2(phrase, replacements));
        System.out.println(mapReplaceAll3(phrase, replacements));
        System.out.println(mapReplaceAll4(phrase, replacements));
    }

    public String mapReplaceAll1(String str, Map<String, String> replacements) {
        for (Map.Entry<String, String> entry : replacements.entrySet()) {
            str = str.replaceAll(entry.getKey(), entry.getValue());
        }

        return str;
    }

    public String mapReplaceAll2(String str, Map<String, String> replacements) {
        for (String key : replacements.keySet()) {
            str = str.replaceAll(Pattern.quote(key),
                    Matcher.quoteReplacement(replacements.get(key)));
        }

        return str;
    }

    public String mapReplaceAll3(String str, Map<String, String> replacements) {        
        String regex = new StringBuilder("(")
            .append(join(replacements.keySet(), "|")).append(")").toString();
        Matcher matcher = Pattern.compile(regex).matcher(str);

        while (matcher.find()) {
            str = str.replaceAll(Pattern.quote(matcher.group(1)),
                    Matcher.quoteReplacement(replacements.get(matcher.group(1))));
        }

        return str;
    }

    public String mapReplaceAll4(String str, Map<String, String> replacements) {        
        StringBuilder buffer = new StringBuilder();
        String regex = new StringBuilder("(")
            .append(join(replacements.keySet(), "|")).append(")").toString();
        Pattern pattern = Pattern.compile(regex);

        for (int i = 0, j = 1; i < str.length(); i++, j++) {
            String s = str.substring(i, j);
            Matcher matcher = pattern.matcher(s);


            if (matcher.find()) {
                buffer.append(s.replaceAll(Pattern.quote(matcher.group(1)),
                            Matcher.quoteReplacement(replacements.get(matcher.group(1)))));
            } else {
                buffer.append(s);
            }
        }


        return buffer.toString();
    }

    public static String join(Collection<String> s, String delimiter) {
        StringBuilder buffer = new StringBuilder();
        Iterator<String> iter = s.iterator();
        while (iter.hasNext()) {
            buffer.append(iter.next());
            if (iter.hasNext()) {
                buffer.append(delimiter);
            }
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        new ReplaceMap();
    }
}

3 Answers:

Answer 0 (score: 1)

My approach is below. There may be faster solutions, but if you like the idea you can take it further.

public String mapReplaceAll5(String str, Map<String, String> replacements) {
    // Phase 1 swaps each key for a unique one-character marker; phase 2 swaps
    // each marker for its final value. This assumes the marker characters
    // never occur in str or in any replacement value.
    Map<String, String> origToMarker = new HashMap<String, String>();
    Map<String, String> markerToRepl = new HashMap<String, String>();
    char c = 32000;
    for (Map.Entry<String, String> e : replacements.entrySet()) {
        String marker = String.valueOf(c--);
        origToMarker.put(e.getKey(), marker);
        markerToRepl.put(marker, e.getValue());
    }
    // String.replace is literal, so keys and values are not read as regex.
    for (Map.Entry<String, String> entry : origToMarker.entrySet()) {
        str = str.replace(entry.getKey(), entry.getValue());
    }
    for (Map.Entry<String, String> entry : markerToRepl.entrySet()) {
        str = str.replace(entry.getKey(), entry.getValue());
    }

    return str;
}

Answer 1 (score: 1)

I would do it like this:

replace(str, map)
    if we have the empty string, the result is the empty string.
    if the string starts with one of the keys from the map:
        the result is the replacement associated with that key + replace(str', map)
             where str' is the substring of str after the key
    otherwise the result is the first character of str + replace(str', map)
             where str' is the substring of str without the first character

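Note that although this is defined recursively, it can (and should, given Java's notoriously small stack space) be implemented as a loop that writes the first part of the result (i.e. the replacement string or the first character) to a string builder.

If one key in the map is a prefix of another key (e.g. "key" and "keys"), you may need to try the keys in order of decreasing length.

Note further that a faster algorithm could be devised using tries instead of HashMaps. That would also resolve the ambiguous-key problem.

Here is an outline (untested):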

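import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public static String replace(String it, Map<String, String> map) {
    StringBuilder sb = new StringBuilder();
    // keySet() returns a Set, so copy it into a List before sorting.
    List<String> keys = new ArrayList<String>(map.keySet());
    // Longest keys first, so that "keys" is tried before its prefix "key".
    Collections.sort(keys, new Comparator<String>() {
        @Override
        public int compare(String a, String b) {
            return Integer.compare(b.length(), a.length());
        }
    });
    next: while (it.length() > 0) {
        for (String k : keys) {
            // Skip empty keys: they match everywhere and never advance.
            if (!k.isEmpty() && it.startsWith(k)) {
                // we have a match!
                sb.append(map.get(k));
                it = it.substring(k.length());
                continue next;
            }
        }
        // no match, advance one character
        sb.append(it.charAt(0));
        it = it.substring(1);
    }
    return sb.toString();
}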
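
The trie idea mentioned in this answer could look roughly like the sketch below. It is not the answerer's code: the class name TrieReplacer and its methods are made up for illustration, and it assumes that no key is empty and that no value is null (a null value is treated as "no key ends here"). At each position it walks the trie as far as the input allows and keeps the longest key seen, so "keys" wins over its prefix "key" without any sorting.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the original thread.
class TrieReplacer {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        String replacement; // non-null only where a key ends
    }

    private final Node root = new Node();

    TrieReplacer(Map<String, String> replacements) {
        for (Map.Entry<String, String> e : replacements.entrySet()) {
            if (e.getKey().isEmpty()) {
                continue; // an empty key could never advance the scan
            }
            Node node = root;
            for (char c : e.getKey().toCharArray()) {
                node = node.children.computeIfAbsent(c, k -> new Node());
            }
            node.replacement = e.getValue();
        }
    }

    String replace(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            Node node = root;
            String best = null;
            int bestEnd = -1;
            // Walk the trie from position i, remembering the longest key matched.
            for (int j = i; j < input.length(); j++) {
                node = node.children.get(input.charAt(j));
                if (node == null) {
                    break;
                }
                if (node.replacement != null) {
                    best = node.replacement;
                    bestEnd = j + 1;
                }
            }
            if (best != null) {
                out.append(best); // emit the replacement and skip past the key
                i = bestEnd;
            } else {
                out.append(input.charAt(i)); // no key starts here, copy one char
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> m = new HashMap<>();
        m.put("key", "K");
        m.put("keys", "KS");
        System.out.println(new TrieReplacer(m).replace("key keys")); // prints "K KS"
    }
}
```

Because replaced text is appended to the output and never rescanned, an inserted "Apple" cannot be hit by a later "a" replacement.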

Answer 2 (score: 0)

You can use StringUtils.replaceEach with a map, at the cost of copying the data into a pair of arrays.

public String replaceEach(String s, Map<String, String> replacements)
{
    int size = replacements.size();
    String[] keys = replacements.keySet().toArray(new String[size]);
    String[] values = replacements.values().toArray(new String[size]);
    return StringUtils.replaceEach(s, keys, values);
}

Using a LinkedHashMap is recommended so that the iteration order is well defined, but I suspect it will work fine with a HashMap.
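
For readers who would rather avoid the Apache dependency, a similar no-overwrite result can be sketched with java.util.regex alone. This is an illustrative sketch, not code from the thread: the class SinglePassReplace and the method replaceAllKeys are hypothetical names. It builds one alternation from the quoted keys and uses Matcher.appendReplacement, so the input is scanned once and already-inserted text is never matched again. Iteration order matters here too: when two keys can match at the same position, the alternative listed first wins, so a LinkedHashMap with longer keys inserted first gives predictable results.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original thread.
class SinglePassReplace {

    static String replaceAllKeys(String input, Map<String, String> replacements) {
        // Build "\Qa\E|\Qb\E|..." so that every key is matched literally.
        StringBuilder alternation = new StringBuilder();
        for (String key : replacements.keySet()) {
            if (key.isEmpty()) {
                continue; // an empty alternative would match at every position
            }
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(Pattern.quote(key));
        }
        if (alternation.length() == 0) {
            return input; // nothing to replace
        }

        Matcher matcher = Pattern.compile(alternation.toString()).matcher(input);
        // StringBuffer keeps this compatible with Java versions before 9.
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            // quoteReplacement stops '$' and '\' in values acting as group references.
            String value = replacements.get(matcher.group());
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> replacements = new LinkedHashMap<>();
        replacements.put("a", "Apple");
        replacements.put("b", "Banana");
        replacements.put("c", "Cantalope");
        replacements.put("d", "Date");
        // prints "Apple Banana Cantalope Date Apple Banana Cantalope Date"
        System.out.println(replaceAllKeys("a b c d a b c d", replacements));
    }
}
```

This sketch assumes every key maps to a non-null value; a null value would make Matcher.quoteReplacement throw a NullPointerException.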