这是从Attaching ancestor attributes to child nodes
链接的我从大型xml数据集中提取名称,我需要提取displayname和其他名称类型(目前我只选择同义词和SystematicNames)。现在,在我到目前为止已经得到了一个很棒的人的帮助下,但它只提取了每种类型的第一个......
示例XML
<Chemical id="0000103902" displayFormula="C8-H9-N-O2" displayName="Acetaminophen [USP:JAN]">
<NameList>
<DescriptorName>Acetaminophen<SourceList><Source>MeSH</Source></SourceList></DescriptorName>
<NameOfSubstance>Acetaminophen<SourceList><Source>HSDB</Source><Source>MeSH</Source></SourceList></NameOfSubstance>
<NameOfSubstance>Acetaminophen [USP:JAN]<SourceList><Source>NLM</Source></SourceList></NameOfSubstance>
<MixtureName>Actifed Plus<SourceList><Source>MeSH</Source></SourceList></MixtureName>
<MixtureName>Jin Gang<SourceList><Source>NLM</Source></SourceList></MixtureName>
<MixtureName>Talacen<SourceList><Source>NLM</Source></SourceList></MixtureName>
<SystematicName>Acetamide, N-(4-hydroxyphenyl)-<SourceList><Source>EPA SRS</Source><Source>MeSH</Source><Source>TSCAINV</Source></SourceList></SystematicName>
<SystematicName>Acetaminophen<SourceList><Source>CCRIS</Source></SourceList></SystematicName>
<SystematicName>Acetanilide, 4'-hydroxy-<SourceList><Source>RTECS</Source></SourceList></SystematicName>
<SystematicName>Paracetamol<SourceList><Source>ECHA</Source><Source>EINECS</Source></SourceList></SystematicName>
<Synonyms>4-13-00-01091 (Beilstein Handbook Reference)<SourceList><Source>RTECS</Source></SourceList></Synonyms>
<Synonyms>Abensanil<SourceList><Source>HSDB</Source><Source>RTECS</Source></SourceList></Synonyms>
<Synonyms>Acetagesic<SourceList><Source>HSDB</Source><Source>RTECS</Source></SourceList></Synonyms>
<Synonyms>Acetamide, N-(p-hydroxyphenyl)-<SourceList><Source>RTECS</Source></SourceList></Synonyms>
</NameList>
</Chemical>
当前代码
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" />
<xsl:variable name="FS">
<!-- Field seperator -->
<xsl:text>;</xsl:text>
</xsl:variable>
<xsl:variable name="LT">
<!-- Line terminator -->
<xsl:text> </xsl:text>
</xsl:variable>
<xsl:strip-space elements="*" />
<xsl:template match="/">
<xsl:for-each select="//Chemical[@displayName != '' and @displayName != 'INDEX NAME NOT YET ASSIGNED']">
<xsl:call-template name="printValues">
<xsl:with-param name="val1" select="@id" />
<xsl:with-param name="val2" select="@displayName" />
</xsl:call-template>
<xsl:if test="normalize-space(NameList/SystematicName/text()) != ''">
<xsl:call-template name="printValues">
<xsl:with-param name="val1" select="@id" />
<xsl:with-param name="val2" select="normalize-space(NameList/SystematicName/text())" />
</xsl:call-template>
</xsl:if>
<xsl:if test="normalize-space(NameList/Synonyms/text()) != ''">
<xsl:call-template name="printValues">
<xsl:with-param name="val1" select="@id" />
<xsl:with-param name="val2" select="normalize-space(NameList/Synonyms/text())" />
</xsl:call-template>
</xsl:if>
</xsl:for-each>
</xsl:template>
<xsl:template name="printValues">
<xsl:param name="val1" />
<xsl:param name="val2" />
<!-- constants -->
<xsl:variable name="url" select="'https://chem.nlm.nih.gov/chemidplus/sid/startswith/'" />
<xsl:variable name="src" select="'nlm'" />
<xsl:text>"</xsl:text>
<xsl:call-template name="escapeQuote">
<xsl:with-param name="paramStr" select="$val2" />
</xsl:call-template>
<xsl:text>"</xsl:text>
<xsl:text>,</xsl:text>
<xsl:text>"</xsl:text>
<xsl:value-of select="concat($url, $val1)" />
<xsl:text>"</xsl:text>
<xsl:text>,</xsl:text>
<xsl:text>"</xsl:text>
<xsl:value-of select="$src" />
<xsl:text>"</xsl:text>
<xsl:text> </xsl:text>
</xsl:template>
<xsl:template name="escapeQuote">
<xsl:param name="paramStr" />
<xsl:if test="string-length($paramStr) > 0">
<xsl:value-of select="substring-before(concat($paramStr, '"'), '"')" />
<xsl:if test="contains($paramStr, '"')">
<xsl:text>\"</xsl:text>
<xsl:call-template name="escapeQuote">
<xsl:with-param name="paramStr" select="substring-after($paramStr, '"')" />
</xsl:call-template>
</xsl:if>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
然而,这仅给出: -
"Acetaminophen [USP:JAN]","https://chem.nlm.nih.gov/chemidplus/sid/startswith/0000103902","nlm"
"Acetamide, N-(4-hydroxyphenyl)-","https://chem.nlm.nih.gov/chemidplus/sid/startswith/0000103902","nlm"
"4-13-00-01091 (Beilstein Handbook Reference)","https://chem.nlm.nih.gov/chemidplus/sid/startswith/0000103902","nlm"
如何以同样的方式提取所有孩子?
答案 0 :(得分:0)
通过为这些元素添加<SystematicName>
循环,可以在<Synonyms>
和<xsl:for-each>
节点中打印文本值。 <xsl:if>
条件也可以在<xsl:for-each>
选择中处理。
请修改<xsl:template match="/">
,如下所示。
<xsl:template match="Chemical[@displayName != '' and @displayName != 'INDEX NAME NOT YET ASSIGNED']">
<xsl:variable name="idValue" select="@id" />
<xsl:call-template name="printValues">
<xsl:with-param name="val1" select="$idValue" />
<xsl:with-param name="val2" select="@displayName" />
</xsl:call-template>
<xsl:for-each select="NameList/SystematicName[text() != '']">
<xsl:call-template name="printValues">
<xsl:with-param name="val1" select="$idValue" />
<xsl:with-param name="val2" select="normalize-space(text())" />
</xsl:call-template>
</xsl:for-each>
<xsl:for-each select="NameList/Synonyms[text() != '']">
<xsl:call-template name="printValues">
<xsl:with-param name="val1" select="$idValue" />
<xsl:with-param name="val2" select="normalize-space(text())" />
</xsl:call-template>
</xsl:for-each>
</xsl:template>
输出
"Acetaminophen [USP:JAN]","https://chem.nlm.nih.gov/chemidplus/sid/startswith/0000103902","nlm"
"Acetamide, N-(4-hydroxyphenyl)-","https://chem.nlm.nih.gov/chemidplus/sid/startswith/0000103902","nlm"
"Acetaminophen","https://chem.nlm.nih.gov/chemidplus/sid/startswith/0000103902","nlm"
"Acetanilide, 4'-hydroxy-","https://chem.nlm.nih.gov/chemidplus/sid/startswith/0000103902","nlm"
"Paracetamol","https://chem.nlm.nih.gov/chemidplus/sid/startswith/0000103902","nlm"
"4-13-00-01091 (Beilstein Handbook Reference)","https://chem.nlm.nih.gov/chemidplus/sid/startswith/0000103902","nlm"
"Abensanil","https://chem.nlm.nih.gov/chemidplus/sid/startswith/0000103902","nlm"
"Acetagesic","https://chem.nlm.nih.gov/chemidplus/sid/startswith/0000103902","nlm"
"Acetamide, N-(p-hydroxyphenyl)-","https://chem.nlm.nih.gov/chemidplus/sid/startswith/0000103902","nlm"