Separating the contents of custom tags using regex in Java

Time: 2016-04-06 18:25:05

Tags: java regex

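I need a piece of code that takes all of the values surrounded by tags in a String in Java and returns them as a string array, but only for tags whose names match one of a series of keywords. The tags are just plain text surrounded by "<>", with a "/" closing tag for each tag that is opened.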

Example. Text read in -

  <name>stuff<name/>
  <locations>example of text<locations/>
  <storybattles>more text somehow<storybattles/>
  <maincharacter>characters n stuff <maincharacter/>
//continues on with random tag text values

Returns -

"stuff"
"example of text"
"more text somehow"
"characters n stuff"

Preferred use case -

String inputText="pretend there are tags in here";
//Please pretend I added several keywords to the keywords list
ArrayList<String> keywords=new ArrayList<String>(); 
String[] allTheAnswers=kindStackOverflowMentorMethod(inputText,keywords);

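I could do this on my own with my limited knowledge of regex, but I'm asking because I know it can be done better. If you include an explanation of each part of the regex you use (or of whatever other solution a clever mind comes up with), you get extra points from me.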

2 answers:

Answer 0: (score: 0)

Here is a working example of how I would do it:

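import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagContents {

    private static final String DATA = "<name>stuff<name/>\n" +
            "  <locations>example of text<locations/>\n" +
            "  <storybattles>more text somehow<storybattles/>\n" +
            "  <maincharacter>characters n stuff <maincharacter/>";

    private static final List<String> KEYWORDS = Arrays.asList("name", "locations");

    // %1$s is replaced by the keyword in both the opening and the closing tag;
    // (.+?) reluctantly captures the text between them.
    private static final String PATTERN = "<%1$s>(.+?)<%1$s/>";

    public static void main(String[] args) {

        List<String> strs = new ArrayList<>();
        for (String keyword : KEYWORDS) {
            // Pattern.quote keeps any regex metacharacters in the keyword literal
            String tempPattern = String.format(PATTERN, Pattern.quote(keyword));
            Pattern pattern = Pattern.compile(tempPattern);
            Matcher matcher = pattern.matcher(DATA);

            while (matcher.find()) {
                strs.add(matcher.group(1));
            }
        }
        System.out.println(strs); // prints [stuff, example of text]
    }
}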

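If you want the `String[]` the question asks for and a single pass over the text, one possible variation is to join the keywords into one alternation instead of compiling a pattern per keyword. This is only a sketch: the class name `TagExtractor` and the method name `extractTagValues` are made up for illustration, and it assumes, as above, that tag contents never span a line break. Note that the values come back in the order they appear in the text, not in keyword order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class TagExtractor {

    // Hypothetical helper: returns the contents of every <keyword>...<keyword/>
    // pair in inputText whose tag name is one of the given keywords.
    static String[] extractTagValues(String inputText, List<String> keywords) {
        if (keywords.isEmpty()) {
            // An empty alternation would match an empty tag name, so stop early.
            return new String[0];
        }
        // Pattern.quote turns each keyword into a literal; "|" joins them as alternatives.
        String alternation = keywords.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        // Group 1: the tag name. Group 2: the content (reluctant match).
        // \1 requires the closing tag to repeat the name captured by group 1.
        Pattern pattern = Pattern.compile("<(" + alternation + ")>(.+?)<\\1/>");
        Matcher matcher = pattern.matcher(inputText);

        List<String> values = new ArrayList<>();
        while (matcher.find()) {
            values.add(matcher.group(2));
        }
        return values.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String data = "<name>stuff<name/>\n"
                + "  <locations>example of text<locations/>\n"
                + "  <storybattles>more text somehow<storybattles/>\n"
                + "  <maincharacter>characters n stuff <maincharacter/>";
        String[] values = extractTagValues(data, Arrays.asList("name", "locations"));
        System.out.println(Arrays.toString(values)); // prints [stuff, example of text]
    }
}
```
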

Answer 1: (score: 0)

Is this what you are looking for?

import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;


public static void main(String[] args) {

    String inputText=" <name>stuff<name/>\n"+
        " <locations>example of text<locations/>\n"+
        " <storybattles>more text somehow<storybattles/>\n"+
        " <maincharacter>characters n stuff <maincharacter/>";

    //Please pretend I added several keywords to the keywords list
    ArrayList<String> keywords=new ArrayList<>(); 
    keywords.add("locations");
    keywords.add("maincharacter");

    //Call the function
    ArrayList<String> allTheAnswers=kindStackOverflowMentorMethod(inputText,keywords);

}

public static ArrayList<String> kindStackOverflowMentorMethod(String inputText, ArrayList<String> keywords){
    ArrayList<String> values=new ArrayList<>();
    Matcher m = Pattern.compile("<([a-z][a-z0-9]*)>(.*?)<(?:\\1)\\/>").matcher(inputText);
    while (m.find()){
        if (keywords.contains(m.group(1))) {
            values.add(m.group(2));            
        }
    }
    return values;
}

REGEX EXPLANATION

<                   # match < literally
([a-z][a-z0-9]*)    # first capturing group - match TAG name
                      must start with a lowercase letter, followed by
                      0 or more lowercase letters or digits
>                   # match > literally 
(.*?)               # 2nd capturing group - match content surrounded by TAGs
                      non-greedy match
<                   # match < literally
(?:\1)              # non-capturing group - match previous matched TAG name
\/>                 # match /> literally
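
To see what each part of the expression does in practice, here is a small sketch that runs the same regex against a few edge cases. The class `TagRegexDemo` and the helper `findPairs` are hypothetical names used only for this illustration. It shows that the back-reference rejects a closing tag with a different name, that "." does not match a line break unless `Pattern.DOTALL` is set, and that uppercase tag names are only matched with `Pattern.CASE_INSENSITIVE`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagRegexDemo {

    // Same expression as in the answer above.
    static final String TAG_REGEX = "<([a-z][a-z0-9]*)>(.*?)<(?:\\1)\\/>";

    // Hypothetical helper: returns "tagName=content" for every match,
    // compiling the regex with the given flags (0 means no flags).
    static List<String> findPairs(String text, int flags) {
        Matcher m = Pattern.compile(TAG_REGEX, flags).matcher(text);
        List<String> pairs = new ArrayList<>();
        while (m.find()) {
            pairs.add(m.group(1) + "=" + m.group(2));
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Matching pair: group 1 is the tag name, group 2 the content.
        System.out.println(findPairs("<name>stuff<name/>", 0));          // prints [name=stuff]

        // The closing name differs from the opening one, so \1 fails.
        System.out.println(findPairs("<name>stuff<title/>", 0));         // prints []

        // By default "." does not match a line terminator, so content
        // that spans two lines is not found.
        System.out.println(findPairs("<name>two\nlines<name/>", 0));     // prints []

        // With DOTALL, "." also matches the line break.
        System.out.println(findPairs("<name>two\nlines<name/>", Pattern.DOTALL).size()); // prints 1

        // [a-z] is lowercase only; CASE_INSENSITIVE is needed for <Name>.
        System.out.println(findPairs("<Name>stuff<Name/>", 0));          // prints []
        System.out.println(findPairs("<Name>stuff<Name/>", Pattern.CASE_INSENSITIVE)); // prints [Name=stuff]
    }
}
```
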