XSL: matching nodes based on an ID attribute

Asked: 2019-09-03 19:08:47

Tags: xml xslt xslt-2.0

I think this is a simple problem and I am just not wording it correctly, but I have been stuck on it for hours.

I have XML like this:

<NORMDOC>
    <DOC>
        <TXT>
        <S sid="112233-SENT-001">
            <ENAMEX type="PERSON" id="PER-112233-001">George Washington</ENAMEX> and 
            <ENAMEX type="PERSON" id="PER-112233-002">Thomas Jefferson</ENAMEX> were both founding fathers.
        </S>
        <S sid="112233-SENT-002">
            <ENAMEX type="PERSON" id="PER-112233-002">Thomas Jefferson</ENAMEX> 
            has a social security number of <IDEX type="SSN" id="SSN-112233-075">222-22-2222</IDEX>.
        </S>
      </TXT>
   </DOC>
   <ENTINFO ID="PER-112233-002"
            TYPE="PERSON"
            NORM="Jefferson, Thomas"
            REFID="PER-112233-002"
            MENTION="Thomas Jefferson"
            GIVEN="Thomas"
            MIDDLE=""
            SURNAME="Jefferson"/>
</NORMDOC>

I am trying to merge the contents of the ENTINFO and S tags by matching the ID and id attributes.

Desired output:

<ENTINFO>
    <ENTINFO_PERSON_NORM>Jefferson, Thomas</ENTINFO_PERSON_NORM>
    <ENTINFO_PERSON_MENTION>Thomas Jefferson</ENTINFO_PERSON_MENTION>
    <ENTINFO_PERSON_GIVEN>Thomas</ENTINFO_PERSON_GIVEN>
    <ENTINFO_PERSON_MIDDLE/>
    <ENTINFO_PERSON_SURNAME>Jefferson</ENTINFO_PERSON_SURNAME>
    <ENTINFO_SSN_NORM>222222222</ENTINFO_SSN_NORM>
    <ENTINFO_SSN_MENTION>social security number of 222-22-2222</ENTINFO_SSN_MENTION>
</ENTINFO>

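The part I am having trouble with is referencing the id inside the S element, using it for the comparison, and extracting data from the S element when it matches.

Here is my XSLT: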

<xsl:template match="ENTINFO">
    <xsl:copy>
        <!-- For each ENTINFO attribute, create a new ENTINFO element and append the attribute --> 
        <!-- name to the end of the element name ie ENTINFO ID=myid becomes <ENTINFO_ID>myid</ENTINFO_ID> -->
        <xsl:for-each select="@*">
            <xsl:element name="ENTINFO_{translate(name(), '-', '_')}">
                <xsl:value-of select="." />
            </xsl:element>
        </xsl:for-each>
        <!-- This code does not match anything so Mr. Jefferson's SSN never gets pulled in -->
        <xsl:if test="NORMDOC/DOC/TXT/S/IDEX[@id]=@ID">
            <xsl:for-each select="NORMDOC/DOC/TXT/S[@*]">
                <xsl:element name="ENTINFO_{translate(name(), '-', '_')}">
                    <xsl:value-of select="." />
                </xsl:element>
            </xsl:for-each>
        </xsl:if>
    </xsl:copy>
</xsl:template>

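The first block of code works correctly and I get the additional ENTINFO elements I want, but the SSN is not matched and pulled out of the IDEX element. The second block of code does not work.

Here is the actual output (I only care about ENTINFO; I will deal with the rest of the output later):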

<NORMDOC>
   <DOC>
      <RAW_TXT>George Washington and Thomas Jefferson were both founding fathers.Thomas Jefferson has a social security number of 222-22-2222.</RAW_TXT>
      <TXT>
         <S>
            <ENAMEX_PERSON>George Washington</ENAMEX_PERSON>
            <ENAMEX_PERSON>Thomas Jefferson</ENAMEX_PERSON>
         </S>
         <S>
            <ENAMEX_PERSON>Thomas Jefferson</ENAMEX_PERSON>
            <IDEX_SSN>222-22-2222</IDEX_SSN>
         </S>
      </TXT>
   </DOC>
   <ENTITIES>
      <ENTINFO>
         <ENTINFO_ID>PER-112233-002</ENTINFO_ID>
         <ENTINFO_TYPE>PERSON</ENTINFO_TYPE>
         <ENTINFO_NORM>Jefferson, Thomas</ENTINFO_NORM>
         <ENTINFO_REFID>PER-112233-002</ENTINFO_REFID>
         <ENTINFO_MENTION>Thomas Jefferson</ENTINFO_MENTION>
         <ENTINFO_GIVEN>Thomas</ENTINFO_GIVEN>
         <ENTINFO_MIDDLE/>
         <ENTINFO_SURNAME>Jefferson</ENTINFO_SURNAME>
      </ENTINFO>
   </ENTITIES>
</NORMDOC>

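1 answer:

Answer 0 (score: 1)

Cross-references are best resolved using a key. I could not follow the logic of your expected output; see if the following helps you get started: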

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:key name="k" match="S" use="ENAMEX/@id" />

<xsl:template match="/NORMDOC">
    <root>
        <xsl:for-each select="ENTINFO">
            <xsl:copy>
                <!-- attributes to elements -->
                <xsl:for-each select="@*">
                    <xsl:element name="ENTINFO_{translate(name(), '-', '_')}">
                        <xsl:value-of select="." />
                    </xsl:element>
                </xsl:for-each>
                <!-- mentions by ID -->
                <xsl:for-each select="key('k', @ID)">
                    <ENTINFO_SSN_MENTION>
                        <xsl:value-of select="." />
                    </ENTINFO_SSN_MENTION>
                </xsl:for-each>
            </xsl:copy>
        </xsl:for-each>
    </root>
</xsl:template>

</xsl:stylesheet>

Demo: https://xsltfiddle.liberty-development.net/3NSSEux
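
As background on why the `xsl:if` in the question never fires: inside `match="ENTINFO"` the path `NORMDOC/DOC/TXT/S/IDEX` is evaluated relative to the ENTINFO element, which has no NORMDOC child, so it selects nothing. In addition, `IDEX[@id]=@ID` only tests that an `id` attribute exists and then compares the text of IDEX with `@ID`. The following is a minimal sketch, not a definitive solution, of one way to extend the key-based approach so that IDEX values are pulled from sentences mentioning the entity. The key name `sentence-by-entity`, the rule that the NORM value is the IDEX text with hyphens removed, and the use of `@type` in the element name are assumptions made for illustration; the longer MENTION text shown in the desired output ("social security number of ...") is not reproduced here because the rule for deriving it is not stated.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Index every S element by the id of each ENAMEX it contains
       (the key name is hypothetical) -->
  <xsl:key name="sentence-by-entity" match="S" use="ENAMEX/@id"/>

  <xsl:template match="/NORMDOC">
    <root>
      <xsl:apply-templates select="ENTINFO"/>
    </root>
  </xsl:template>

  <xsl:template match="ENTINFO">
    <xsl:copy>
      <!-- attributes to elements, as in the question -->
      <xsl:for-each select="@*">
        <xsl:element name="ENTINFO_{translate(name(), '-', '_')}">
          <xsl:value-of select="."/>
        </xsl:element>
      </xsl:for-each>
      <!-- every IDEX found in a sentence that mentions this entity -->
      <xsl:for-each select="key('sentence-by-entity', @ID)/IDEX">
        <!-- assumption: NORM is the IDEX text with hyphens stripped -->
        <xsl:element name="ENTINFO_{@type}_NORM">
          <xsl:value-of select="translate(., '-', '')"/>
        </xsl:element>
        <!-- assumption: MENTION is just the IDEX text itself -->
        <xsl:element name="ENTINFO_{@type}_MENTION">
          <xsl:value-of select="."/>
        </xsl:element>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

With the sample input, only the second sentence contains both the matching ENAMEX and an IDEX, so the SSN elements should be emitted once. This sketch assumes every `type` value is usable as part of an element name; a value containing spaces or other illegal name characters would need to be sanitized first.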