Regex to get the string value of an XML node

Time: 2011-08-30 08:35:34

Tags: regex

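I have some output from which I want to get the value of the CMEngine node, i.e. everything inside the CMEngine node. Please help me with a regular expression. I already have Java code that uses a regex, so I only need the regex itself. Thanks.

My XML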

<General>
    <LanguageID>en_US</LanguageID>
<CMEngine>
    <CMServer/> <!-- starting here -->
    <DaysToKeepHistory>4</DaysToKeepHistory>
    <PreprocessorMaxBuf>5000000</PreprocessorMaxBuf>
    <ServiceRefreshInterval>30</ServiceRefreshInterval>
    <ReuseMemoryBetweenRequests>true</ReuseMemoryBetweenRequests>
    <Trace Enabled="false">
        <ActiveCategories>
            <Category>ENVIRONMENT</Category>
            <Category>EXEC</Category>
            <Category>EXTERNALS</Category>
            <Category>FILESYSTEM</Category>
            <Category>INPUT_DOC</Category>
            <Category>INTERFACES</Category>
            <Category>NETWORKING</Category>
            <Category>OUTPUT_DOC</Category>
            <Category>PREPROCESSOR_INPUT</Category>
            <Category>REQUEST</Category>
            <Category>SYSTEMRESOURCES</Category>
            <Category>VIEWIO</Category>
        </ActiveCategories>
        <SeverityLevel>ERROR</SeverityLevel>
        <MessageInfo>
            <ProcessAndThreadIds>true</ProcessAndThreadIds>
            <TimeStamp>true</TimeStamp>
        </MessageInfo>
        <TraceFile>
            <FileName>CMEngine_log.txt</FileName>
            <MaxFileSize>1000000</MaxFileSize>
            <RecyclingMethod>Restart</RecyclingMethod>
        </TraceFile>
    </Trace>
    <JVMLocation>C:\Informatica\9.1.0\java\jre\bin\server</JVMLocation>
    <JVMInitParamList/>  <!-- Ending here -->
</CMEngine>
</General>

1 answer:

Answer 0: (score: 2)

If it has to be a regex, and there is only one CMEngine tag per string:

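import java.util.regex.Matcher;
import java.util.regex.Pattern;

String resultString = null; // stays null if no <CMEngine> element is found
Pattern regex = Pattern.compile("(?<=<CMEngine>)(?:(?!</CMEngine>).)*", Pattern.DOTALL);
Matcher regexMatcher = regex.matcher(subjectString); // subjectString holds the XML text
if (regexMatcher.find()) {
    resultString = regexMatcher.group(); // everything between <CMEngine> and </CMEngine>
}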

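Since this output appears to be machine-generated and is unlikely to contain comments or anything else that could confuse the regex, this should be quite reliable.

It starts at the position right after the <CMEngine> tag: (?<=<CMEngine>)
and matches every character up to the next </CMEngine> tag: (?:(?!</CMEngine>).)*
Pattern.DOTALL lets the dot match line breaks as well, so the match can span multiple lines.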
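
To see the two parts of the pattern working together, here is a minimal self-contained sketch. The class name CMEngineExtractor, the method extract, and the shortened sample XML are assumptions made for illustration only; the pattern itself is the one from the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CMEngineExtractor {
    // Pattern from the answer: a lookbehind anchors the match right after
    // the opening tag, then characters are consumed one at a time for as
    // long as the closing tag does not start at the current position.
    private static final Pattern CM_ENGINE = Pattern.compile(
            "(?<=<CMEngine>)(?:(?!</CMEngine>).)*", Pattern.DOTALL);

    /** Returns the text between <CMEngine> and </CMEngine>, or null if there is no match. */
    public static String extract(String xml) {
        Matcher m = CM_ENGINE.matcher(xml);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Shortened, hypothetical version of the XML from the question.
        String xml = "<General>\n"
                + "<CMEngine>\n"
                + "    <CMServer/>\n"
                + "    <DaysToKeepHistory>4</DaysToKeepHistory>\n"
                + "</CMEngine>\n"
                + "</General>";
        // Prints the inner content, including the surrounding line breaks.
        System.out.println(extract(xml));
    }
}
```

Note that the lookbehind only matches a literal <CMEngine> tag. If the opening tag carried attributes or extra whitespace, this pattern would find nothing and the sketch would return null.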