Return only the sequences that appear inside double quotes?

Time: 2018-01-14 15:37:28

Tags: java

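I want to write a method that returns only the characters of a string that appear inside double quotes. Given String input = "\"x\" y \"z\"", I want to return "xz".

The following method returns only "x", because the pattern matcher finds only one match.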

    static String removeCharsNotInQuotes(String text) {
        StringBuilder builder = new StringBuilder();

        String withinQuotesRegex = "\"(.*?)\"";

        Pattern pattern = Pattern.compile(withinQuotesRegex);
        Matcher matcher = pattern.matcher(text);
        if (matcher.find()) {
            for (int i = 1; i <= matcher.groupCount(); i++) {
                builder.append(matcher.group(i));
                builder.append(" ");
            }
        }

        return builder.toString().trim();
    }

2 answers:

Answer 0 (score: 3)

You should use a while loop with matcher.find(). For example:

while (matcher.find()) {
    builder.append(matcher.group(1));
    builder.append(" ");
}

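Also, matcher.groupCount() returns the number of capturing groups in the regex, not the number of matches, so looping over it is only useful when the pattern has multiple groups, which yours does not.

The integer argument to group() says which group of the regex you want to access. In your case it is always 1, because your pattern has only one group.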
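To make the difference between groupCount() and the number of matches concrete, here is a minimal sketch. The class name GroupCountDemo and its two helper methods are made up for illustration; only Pattern, Matcher, find() and groupCount() come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question.
class GroupCountDemo {
    // Number of capturing groups declared in the pattern.
    // This depends only on the regex, never on the input text.
    static int groupsInPattern(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // Number of matches in the text, counted by calling find() repeatedly.
    static int countMatches(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String regex = "\"(.*?)\"";
        String text = "\"x\" y \"z\"";
        System.out.println(groupsInPattern(regex));    // prints 1
        System.out.println(countMatches(regex, text)); // prints 2
    }
}
```

With the pattern from the question there is one group but two matches, which is why a for loop over groupCount() runs only once per find() call.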

Answer 1 (score: 3)

Use a while loop to keep finding new matches until there are none left:

while (matcher.find()) {
    // you don't need to use a for loop here. You just need group 1
    builder.append(matcher.group(1));
    // given your sample output, you don't seem to want a space between
    // the stuff in each pair of quotes, so you may want to drop this line.
    builder.append(" ");
}

You seem to be confused about what find does. It does not find all the matches and put each one in its own group. It only advances the matcher's state to the next match it finds.
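Putting the pieces together, a self-contained version of the method might look like the sketch below. The class name QuotedText and the method name keepCharsInQuotes are placeholders, and the version assumes, based on the expected "xz" in the question, that no separator is wanted between the quoted pieces.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class for the method from the question.
class QuotedText {
    // Reluctant quantifier so each match stops at the nearest closing quote.
    private static final Pattern WITHIN_QUOTES = Pattern.compile("\"(.*?)\"");

    static String keepCharsInQuotes(String text) {
        StringBuilder builder = new StringBuilder();
        Matcher matcher = WITHIN_QUOTES.matcher(text);
        // Each find() call advances the matcher to the next match, if any.
        while (matcher.find()) {
            // Group 1 is the text between the quotes of the current match.
            builder.append(matcher.group(1));
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.println(keepCharsInQuotes("\"x\" y \"z\"")); // prints xz
    }
}
```

If a space between the pieces is wanted after all, appending " " inside the loop and trimming the result, as in the original method, gives "x z" instead.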