Regex to get all matching values from XML nodes

Asked: 2011-09-02 08:06:40

Tags: java regex

Following this question, I only get the first match. I want to get all matches, either as a string or as an array of strings.

This is part of my output, from which I need to extract every Category:

<Trace Enabled="false">
        <ActiveCategories>
            <Category>ENVIRONMENT</Category>
            <Category>EXEC</Category>
            <Category>EXTERNALS</Category>
            <Category>FILESYSTEM</Category>
            <Category>INPUT_DOC</Category>
            <Category>INTERFACES</Category>
            <Category>NETWORKING</Category>
            <Category>OUTPUT_DOC</Category>
            <Category>PREPROCESSOR_INPUT</Category>
            <Category>REQUEST</Category>
            <Category>SYSTEMRESOURCES</Category>
            <Category>VIEWIO</Category>
            <Category>ALL</Category>
        </ActiveCategories>
        <SeverityLevel>ERROR</SeverityLevel>
        <MessageInfo>
            <ProcessAndThreadIds>true</ProcessAndThreadIds>
            <TimeStamp>true</TimeStamp>
        </MessageInfo>
        <TraceFile>
            <FileName>CMDS_log.txt</FileName>
            <MaxFileSize>1000000</MaxFileSize>
            <RecyclingMethod>Restart</RecyclingMethod>
        </TraceFile>
    </Trace>

With the code below I can only capture ENVIRONMENT; I need the values of all Category elements.

def regexFinder(String myInput,String myRegex)
{
String ResultString
Pattern regex
Matcher regexMatcher

regex = Pattern.compile(myRegex, Pattern.DOTALL);
regexMatcher = regex.matcher(myInput);
if (regexMatcher.find()) {
    ResultString = regexMatcher.group();
}
}

tempResultString=regexFinder(ResultString,"(?<=<Category>)(?:(?!</Category>).)*")
    csm.cmengine_category(tempResultString)
    {           "${rs}"     }

2 Answers:

Answer 0 (score: 3)

Don't use regular expressions to parse XML; use a parser.

See RegEx match open tags except XHTML self-contained tags
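
As a rough illustration of the parser approach, here is a minimal sketch that uses the JDK's built-in DOM parser to collect the text of every Category element. The class name CategoryParser and the method name categories are made up for this example, and the XML is a shortened version of the snippet in the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original question.
public class CategoryParser {

    // Parses the XML string and returns the text of every <Category> element.
    public static List<String> categories(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("Category");
            List<String> result = new ArrayList<String>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent().trim());
            }
            return result;
        } catch (Exception e) {
            // Wrap checked parser exceptions so callers get one failure type.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Shortened sample based on the XML in the question.
        String xml = "<Trace Enabled=\"false\"><ActiveCategories>"
                + "<Category>ENVIRONMENT</Category>"
                + "<Category>EXEC</Category>"
                + "<Category>ALL</Category>"
                + "</ActiveCategories></Trace>";
        System.out.println(categories(xml)); // prints [ENVIRONMENT, EXEC, ALL]
    }
}
```

Unlike the regex, this keeps working if the elements gain attributes or the whitespace changes, at the cost of requiring well-formed XML as input.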

Answer 1 (score: 1)

You need to call .find() repeatedly to iterate over all the results:

Matcher regexMatcher = regex.matcher(myInput);
List<String> matchList = new ArrayList<String>();
while (regexMatcher.find()) {
    matchList.add(regexMatcher.group());
}
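
For completeness, the loop above could be wrapped into a self-contained program along these lines. This is only a sketch: the class name CategoryMatcher and the method name findAll are invented for the example, while the pattern and the DOTALL flag are taken from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class, not part of the original answer.
public class CategoryMatcher {

    // Returns every match of myRegex in myInput, in the order they occur.
    public static List<String> findAll(String myInput, String myRegex) {
        Pattern regex = Pattern.compile(myRegex, Pattern.DOTALL);
        Matcher regexMatcher = regex.matcher(myInput);
        List<String> matchList = new ArrayList<String>();
        // Each call to find() resumes where the previous match ended.
        while (regexMatcher.find()) {
            matchList.add(regexMatcher.group());
        }
        return matchList;
    }

    public static void main(String[] args) {
        // Shortened sample based on the XML in the question.
        String xml = "<ActiveCategories>"
                + "<Category>ENVIRONMENT</Category>"
                + "<Category>EXEC</Category>"
                + "<Category>ALL</Category>"
                + "</ActiveCategories>";
        List<String> categories =
                findAll(xml, "(?<=<Category>)(?:(?!</Category>).)*");
        System.out.println(categories); // prints [ENVIRONMENT, EXEC, ALL]

        // The list converts to a String[] or a single joined String if needed.
        String[] asArray = categories.toArray(new String[0]);
        System.out.println(String.join(",", asArray)); // prints ENVIRONMENT,EXEC,ALL
    }
}
```

Returning a List and converting at the call site covers both forms the question asks for, a string array or a single string.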