I think this is a simple problem and I'm probably just wording it badly, but I've been stuck on it for several hours.
I have XML like this:
<NORMDOC>
<DOC>
<TXT>
<S sid="112233-SENT-001">
<ENAMEX type="PERSON" id="PER-112233-001">George Washington</ENAMEX> and
<ENAMEX type="PERSON" id="PER-112233-002">Thomas Jefferson</ENAMEX> were both founding fathers.
</S>
<S sid="112233-SENT-002">
<ENAMEX type="PERSON" id="PER-112233-002">Thomas Jefferson</ENAMEX>
has a social security number of <IDEX type="SSN" id="SSN-112233-075">222-22-2222</IDEX>.
</S>
</TXT>
</DOC>
<ENTINFO ID="PER-112233-002"
TYPE="PERSON"
NORM="Jefferson, Thomas"
REFID="PER-112233-002"
MENTION="Thomas Jefferson"
GIVEN="Thomas"
MIDDLE=""
SURNAME="Jefferson"/>
</NORMDOC>
I'm trying to merge the contents of the ENTINFO and S elements by matching the ID and id attributes.
Desired output:
<ENTINFO>
<ENTINFO_PERSON_NORM>Jefferson, Thomas</ENTINFO_PERSON_NORM>
<ENTINFO_PERSON_MENTION>Thomas Jefferson</ENTINFO_PERSON_MENTION>
<ENTINFO_PERSON_GIVEN>Thomas</ENTINFO_PERSON_GIVEN>
<ENTINFO_PERSON_MIDDLE/>
<ENTINFO_PERSON_SURNAME>Jefferson</ENTINFO_PERSON_SURNAME>
<ENTINFO_SSN_NORM>222222222</ENTINFO_SSN_NORM>
<ENTINFO_SSN_MENTION>social security number of 222-22-2222</ENTINFO_SSN_MENTION>
</ENTINFO>
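The part I'm struggling with is referencing the id inside the S element, using it for the comparison, and pulling data out of the S element when it matches.
Here is my XSLT: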
<xsl:template match="ENTINFO">
<xsl:copy>
<!-- For each ENTINFO attribute, create a new ENTINFO element and append the attribute -->
<!-- name to the end of the element name ie ENTINFO ID=myid becomes <ENTINFO_ID>myid</ENTINFO_ID> -->
<xsl:for-each select="@*">
<xsl:element name="ENTINFO_{translate(name(), '-', '_')}">
<xsl:value-of select="." />
</xsl:element>
</xsl:for-each>
<!-- This code does not match anything so Mr. Jefferson's SSN never gets pulled in -->
<xsl:if test="NORMDOC/DOC/TXT/S/IDEX[@id]=@ID">
<xsl:for-each select="NORMDOC/DOC/TXT/S[@*]">
<xsl:element name="ENTINFO_{translate(name(), '-', '_')}">
<xsl:value-of select="." />
</xsl:element>
</xsl:for-each>
</xsl:if>
</xsl:copy>
</xsl:template>
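The first block works: I get the additional ENTINFO elements I want. The second block does not: the SSN is never matched and pulled out of the IDEX element.
Here is the actual output (I only care about ENTINFO; I'll deal with the rest of the output later):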
<NORMDOC>
<DOC>
<RAW_TXT>George Washington and Thomas Jefferson were both founding fathers.Thomas Jefferson has a social security number of 222-22-2222.</RAW_TXT>
<TXT>
<S>
<ENAMEX_PERSON>George Washington</ENAMEX_PERSON>
<ENAMEX_PERSON>Thomas Jefferson</ENAMEX_PERSON>
</S>
<S>
<ENAMEX_PERSON>Thomas Jefferson</ENAMEX_PERSON>
<IDEX_SSN>222-22-2222</IDEX_SSN>
</S>
</TXT>
</DOC>
<ENTITIES>
<ENTINFO>
<ENTINFO_ID>PER-112233-002</ENTINFO_ID>
<ENTINFO_TYPE>PERSON</ENTINFO_TYPE>
<ENTINFO_NORM>Jefferson, Thomas</ENTINFO_NORM>
<ENTINFO_REFID>PER-112233-002</ENTINFO_REFID>
<ENTINFO_MENTION>Thomas Jefferson</ENTINFO_MENTION>
<ENTINFO_GIVEN>Thomas</ENTINFO_GIVEN>
<ENTINFO_MIDDLE/>
<ENTINFO_SURNAME>Jefferson</ENTINFO_SURNAME>
</ENTINFO>
</ENTITIES>
</NORMDOC>
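Answer 0 (score: 1)
Cross-references are best resolved using a key. I couldn't follow the logic of the expected output, so see if the following gets you started: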
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:key name="k" match="S" use="ENAMEX/@id" />
<xsl:template match="/NORMDOC">
<root>
<xsl:for-each select="ENTINFO">
<xsl:copy>
<!-- attributes to elements -->
<xsl:for-each select="@*">
<xsl:element name="ENTINFO_{translate(name(), '-', '_')}">
<xsl:value-of select="." />
</xsl:element>
</xsl:for-each>
<!-- mentions by ID -->
<xsl:for-each select="key('k', @ID)">
<ENTINFO_SSN_MENTION>
<xsl:value-of select="." />
</ENTINFO_SSN_MENTION>
</xsl:for-each>
</xsl:copy>
</xsl:for-each>
</root>
</xsl:template>
</xsl:stylesheet>
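As for why the original test never matched: inside `match="ENTINFO"` the path `NORMDOC/DOC/TXT/S/...` is relative to the ENTINFO element, which has no NORMDOC child, so it selects nothing. In addition, `IDEX[@id]=@ID` compares the text of IDEX (the SSN itself) with `@ID`, not the `id` attribute. And in the sample input the IDEX id (`SSN-112233-075`) is never equal to the ENTINFO ID, so the only link between the two is the sentence they share with a matching ENAMEX.

Below is a rough sketch of one way to get closer to the desired output, built on the same key idea. It rests on assumptions that are not stated in the question: that an IDEX belongs to every entity whose ENAMEX appears in the same S, that the SSN "NORM" is simply the value with hyphens removed, and that ID, TYPE and REFID should be left out of the result. The key name `sent-by-entity` is made up for illustration. It does not reproduce the "social security number of ..." wording in the desired MENTION, since it is not clear how that text should be delimited.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Index each sentence by the id of every ENAMEX it contains
       (hypothetical key name) -->
  <xsl:key name="sent-by-entity" match="S" use="ENAMEX/@id"/>

  <xsl:template match="/NORMDOC">
    <root>
      <xsl:for-each select="ENTINFO">
        <xsl:variable name="type" select="@TYPE"/>
        <ENTINFO>
          <!-- Attributes to elements, named ENTINFO_<TYPE>_<attribute>;
               skipping ID, TYPE and REFID is an assumption -->
          <xsl:for-each select="@*[name() != 'ID' and name() != 'TYPE' and name() != 'REFID']">
            <xsl:element name="ENTINFO_{$type}_{name()}">
              <xsl:value-of select="."/>
            </xsl:element>
          </xsl:for-each>
          <!-- Every IDEX found in a sentence that mentions this entity -->
          <xsl:for-each select="key('sent-by-entity', @ID)/IDEX">
            <xsl:element name="ENTINFO_{@type}_NORM">
              <!-- assumed normalization: drop the hyphens -->
              <xsl:value-of select="translate(., '-', '')"/>
            </xsl:element>
            <xsl:element name="ENTINFO_{@type}_MENTION">
              <xsl:value-of select="."/>
            </xsl:element>
          </xsl:for-each>
        </ENTINFO>
      </xsl:for-each>
    </root>
  </xsl:template>
</xsl:stylesheet>
```

Note that a sentence mentioning two people plus one IDEX would attach that IDEX to both of them under this rule, so a real document may need a tighter association than "same sentence".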