Extracting a value from a malformed XML string

Date: 2015-10-22 09:30:52

Tags: regex xml

First of all, many thanks to everyone on this forum. I am trying to extract the value of one XML tag from the XML string given below.

The input string is:

<?xml version="1.0" encoding="UTF-8"?>
<nonpublicExecutionReportAcknowledgement xmlns="http://www.fpml.org/FpML-5/recordkeeping" fpmlVersion="5-5" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.fpml.org/FpML-5/recordkeeping /../xmls/SDR/recordkeeping/fpml-main-5-5.xsd">
    <header>
        <inReplyTo messageIdScheme="www.abc.com/msg_id">sit:GDS:1644644:1442512894123:SRD0IFR119094084</inReplyTo>
        <sentBy>DTCCEU</sentBy>
        <sendTo>RRasdfjasdfasdkllkd4</sendTo>
        <creationTimestamp>2015-10-14T16:47:30Z</creationTimestamp>
    </header>
    <originalMessage>
        <nonpublicExecutionReport fpmlVersion="5-5" xmlns="http://www.fpml.org/FpML-5/recordkeeping" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
            <header>                
                <messageId messageIdScheme="www.abc.com/msg_id">sit:GDS:1644644:1442512894123:SRD0IFR119094084</messageId>
                <sentBy messageAddressScheme="http://www.fpml.org/coding-scheme/external/cftc/interim-compliant-identifier">RRasdfjasdfasdkllkd4</sentBy>
                <sendTo>DTCCEU</sendTo>
                <creationTimestamp>2015-10-14T05:54:38Z</creationTimestamp>
            </header>
        </nonpublicExecutionReport>
    </originalMessage>
</nonpublicExecutionReportAcknowledgement>

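The regular expression I use to extract the messageId is "(?<=>).*?.(?=\</messageId)", and it works fine on the input above. However, when the XML arrives as a single-line string, it does not work as expected.

The following input string fails.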

<?xml version="1.0" encoding="UTF-8"?><nonpublicExecutionReportAcknowledgement xmlns="http://www.fpml.org/FpML-5/recordkeeping" fpmlVersion="5-5" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.fpml.org/FpML-5/recordkeeping /../xmls/SDR/recordkeeping/fpml-main-5-5.xsd"><header><inReplyTo messageIdScheme="www.abc.com/msg_id">sit:GDS:1644644:1442512894123:SRD0IFR119094084</inReplyTo><sentBy>DTCCEU</sentBy><sendTo>RRasdfjasdfasdkllkd4</sendTo>   <creationTimestamp>2015-10-14T16:47:30Z</creationTimestamp></header><originalMessage><nonpublicExecutionReport fpmlVersion="5-5" xmlns="http://www.fpml.org/FpML-5/recordkeeping" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><header>                <messageId messageIdScheme="www.abc.com/msg_id">sit:GDS:1644644:1442512894123:SRD0IFR119094084</messageId><sentBy messageAddressScheme="http://www.fpml.org/coding-scheme/external/cftc/interim-compliant-identifier">RRasdfjasdfasdkllkd4</sentBy><sendTo>DTCCEU</sendTo><creationTimestamp>2015-10-14T05:54:38Z</creationTimestamp></header></nonpublicExecutionReport>   </originalMessage></nonpublicExecutionReportAcknowledgement>
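
The difference between the two inputs can be reproduced with a short Python sketch. It assumes the pattern reads `(?<=>).*?.(?=</messageId)` (the backslash before `<` is dropped, which does not change the meaning) and it uses shortened, hypothetical samples rather than the full message. By default `.` does not match a newline, so in the multi-line input the match cannot start before the `messageId` line. In the single-line input nothing stops it from starting right after the first `>` it finds.

```python
import re

# Pattern from the question, as best it can be read (assumption).
PATTERN = re.compile(r"(?<=>).*?.(?=</messageId)")

# Shortened, hypothetical samples; not the full FpML message.
MULTI_LINE = (
    "<header>\n"
    '    <messageId messageIdScheme="www.abc.com/msg_id">sit:GDS:1</messageId>\n'
    "</header>"
)
SINGLE_LINE = (
    "<header>"
    '<messageId messageIdScheme="www.abc.com/msg_id">sit:GDS:1</messageId>'
    "</header>"
)

# Multi-line: the newline after <header> blocks the match from starting there,
# so the first usable ">" is the one that closes the <messageId ...> tag.
multi_match = PATTERN.search(MULTI_LINE).group(0)

# Single-line: the match starts right after <header> and swallows the
# opening <messageId ...> tag along with the value.
single_match = PATTERN.search(SINGLE_LINE).group(0)

print(repr(multi_match))
print(repr(single_match))
```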

The desired output is: "sit:GDS:1644644:1442512894123:SRD0IFR119094084"

The value should be extracted from between
<messageId messageIdScheme="www.abc.com/msg_id"> and </messageId>

Could you please help me build a regular expression for the problem above? That would be a great help.

Cheers, KS

1 answer:

Answer 0 (score: 0)

How about:

<inReplyTo messageIdScheme="www.abc.com/msg_id">([a-zA-Z0-9:]+)</inReplyTo>
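
Note that the pattern above targets `inReplyTo`, which happens to carry the same value in this message. If the `messageId` element itself is wanted, the same capture-group idea can be pointed at that tag. Below is a minimal Python sketch; the language is an assumption (the question does not name one), and `MESSAGE_ID_RE`, `extract_message_id` and `SAMPLE` are hypothetical names, with `SAMPLE` being a shortened stand-in for the single-line input. The pattern uses `[^>]*` to skip any attributes and `[^<]*` to stop at the next tag, so it should not depend on where line breaks fall.

```python
import re

# Capture the text between <messageId ...> and </messageId>.
# [^>]* skips attributes; [^<]* keeps the capture inside one element.
MESSAGE_ID_RE = re.compile(r"<messageId\b[^>]*>([^<]*)</messageId>")


def extract_message_id(xml_text):
    """Return the text of the first messageId element, or None if absent."""
    match = MESSAGE_ID_RE.search(xml_text)
    return match.group(1) if match else None


# Shortened, hypothetical single-line sample based on the question's input.
SAMPLE = (
    '<?xml version="1.0" encoding="UTF-8"?><root><header>'
    '<inReplyTo messageIdScheme="www.abc.com/msg_id">'
    "sit:GDS:1644644:1442512894123:SRD0IFR119094084</inReplyTo>"
    "</header><originalMessage><header>                "
    '<messageId messageIdScheme="www.abc.com/msg_id">'
    "sit:GDS:1644644:1442512894123:SRD0IFR119094084</messageId>"
    "<sendTo>DTCCEU</sendTo></header></originalMessage></root>"
)

message_id = extract_message_id(SAMPLE)
print(message_id)
```

A regex like this is only a sketch for well-behaved input: it would miss a value wrapped in CDATA or containing nested markup, in which case an XML parser is the safer tool.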