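I have the following XML file that I want to parse with R. The XML is deeply nested, and the number of child nodes varies from element to element.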
<?xml version="1.0" encoding="UTF-8"?>
<Alert date="20161223_2" type="full">
<Records>
<Person Id="100">
<PersonNameDetails>
<PersonNames id="Name1">
<ReferenceGroup ReferenceGroupCode="ABC"/>
<ReferenceGroup ReferenceGroupCode="DEF"/>
<PersonNameValue>
<FirstName>Carl Bangouvounda</FirstName>
<Surname>Toziz</Surname>
</PersonNameValue>
</PersonNames>
<PersonNames id="Name2">
<ReferenceGroup ReferenceGroupCode="ABC"/>
<ReferenceGroup ReferenceGroupCode="GHI" ReferenceGroupLanguageCode="en"/>
<ReferenceGroup ReferenceGroupCode="JKL"/>
<ReferenceGroup ReferenceGroupCode="MNO"/>
<ReferenceGroup ReferenceGroupCode="DEF"/>
<PersonNameValue>
<FirstName>Tozize</FirstName>
<Surname>Bangouvonda</Surname>
</PersonNameValue>
</PersonNames>
<PersonNames id="Name3">
<ReferenceGroup ReferenceGroupCode="MNO"/>
<PersonNameValue>
<FirstName>Carol</FirstName>
<Surname>Tozize</Surname>
</PersonNameValue>
</PersonNames>
<PersonNames id="Name4">
<ReferenceGroup ReferenceGroupCode="PQR"/>
<ReferenceGroup ReferenceGroupCode="MNO"/>
<PersonNameValue>
<FirstName>Carol</FirstName>
<MiddleName>Bangouvonda</MiddleName>
<Surname>Tozize</Surname>
</PersonNameValue>
</PersonNames>
<PersonNames id="Name5">
<ReferenceGroup ReferenceGroupCode="GHI" ReferenceGroupLanguageCode="en"/>
<ReferenceGroup ReferenceGroupCode="JKL"/>
<ReferenceGroup ReferenceGroupCode="DEF"/>
<PersonNameValue>
<FirstName>Carl Bangouvonda</FirstName>
<Surname>Toziz</Surname>
</PersonNameValue>
</PersonNames>
</PersonNameDetails>
</Person>
</Records>
</Alert>
The expected output is:
-----------------------------------------------------------
Id | id | ReferenceGroup | FirstName | MiddleName | Surname
-----------------------------------------------------------
100 | Name1 | ABC, DEF | Carl Bangouvounda | NA | Toziz
-----------------------------------------------------------
100 | Name2 | ABC, GHI, JKL, MNO, DEF | Tozize | NA | Bangouvonda
-----------------------------------------------------------
100 | Name3 | MNO | Carol | NA | Tozize
-----------------------------------------------------------
100 | Name4 | PQR, MNO | Carol | Bangouvonda | Tozize
-----------------------------------------------------------
100 | Name5 | GHI, JKL, DEF | Carl Bangouvonda | NA | Toziz
-----------------------------------------------------------
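Id comes from the attribute of the Person element, while all other fields come from PersonNameDetails. I also want to concatenate the ReferenceGroupCode values that belong to the same PersonNames element into a single string.
Following a suggestion, I wrote this XSLT transformation: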
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output version="1.0" encoding="UTF-8" indent="yes" method="xml"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/Alert">
<xsl:copy>
<xsl:apply-templates select="Records"/>
</xsl:copy>
</xsl:template>
<xsl:template match="Records">
<xsl:apply-templates select="Person"/>
</xsl:template>
<xsl:template match="Person">
<xsl:apply-templates select="PersonNameDetails"/>
</xsl:template>
<xsl:template match="PersonNameDetails">
<xsl:apply-templates select="PersonNames"/>
</xsl:template>
<xsl:template match="PersonNames">
<xsl:apply-templates select="PersonNameValue"/>
</xsl:template>
<xsl:template match="PersonNameValue">
<PersonNameValue>
<Id><xsl:value-of select="ancestor::Person/@Id"/></Id>
<id><xsl:value-of select="ancestor::PersonNames/@id"/></id>
<xsl:copy-of select="FirstName"/>
<MiddleName><xsl:value-of select="MiddleName"/></MiddleName>
<Surname><xsl:value-of select="Surname"/></Surname>
<ReferenceGroupCode><xsl:value-of select="ancestor::PersonNames/ReferenceGroup/@ReferenceGroupCode"/></ReferenceGroupCode>
</PersonNameValue>
</xsl:template>
</xsl:transform>
How can I change the XSLT code so that ReferenceGroup is output as
<ReferenceGroupCode>ABC,DEF</ReferenceGroupCode>
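Any help is much appreciated.
Answer 0 (score: 0)
I'm not sure about XSLT, but you can use XPath on the PersonNames nodes and write a function to handle missing or multiple values.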
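library(XML)  # provides xmlParse, getNodeSet, xpathSApply, xmlGetAttr, xmlValue, xmlSize

doc <- xmlParse("<your XML file>")
# One PersonNames node per output row
x <- getNodeSet(doc, "//PersonNames")

# Run an XPath query relative to a node: return NA when nothing matches,
# otherwise collapse all matches into one comma-separated string
xpath2 <- function(x, ...) {
  y <- xpathSApply(x, ...)
  if (length(y) == 0) NA_character_ else paste(y, collapse = ", ")
}

y <- data.frame(
  id             = sapply(x, xpath2, ".", xmlGetAttr, "id"),
  ReferenceGroup = sapply(x, xpath2, ".//ReferenceGroup", xmlGetAttr, "ReferenceGroupCode"),
  FirstName      = sapply(x, xpath2, ".//FirstName", xmlValue),
  MiddleName     = sapply(x, xpath2, ".//MiddleName", xmlValue),
  Surname        = sapply(x, xpath2, ".//Surname", xmlValue)
)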
id ReferenceGroup FirstName MiddleName Surname
1 Name1 ABC, DEF Carl Bangouvounda <NA> Toziz
2 Name2 ABC, GHI, JKL, MNO, DEF Tozize <NA> Bangouvonda
3 Name3 MNO Carol <NA> Tozize
4 Name4 PQR, MNO Carol Bangouvonda Tozize
5 Name5 GHI, JKL, DEF Carl Bangouvonda <NA> Toziz
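Perhaps you can add the Person Id by counting the number of PersonNames nodes under each Person?
# xmlSize counts the children of each PersonNameDetails (here, all of them are PersonNames)
n <- xpathSApply(doc, "//Person/PersonNameDetails", xmlSize)
# Repeat each Person Id once per PersonNames row
y$ID <- rep(xpathSApply(doc, "//Person", xmlGetAttr, "Id"), n)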
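Instead of counting children, each PersonNames node could look up its own Person through the XPath ancestor axis. That would not depend on PersonNameDetails containing only PersonNames children. The following is a minimal sketch, assuming the XML package is installed; the inline sample_xml data and the person_id helper are hypothetical names introduced for illustration and are not part of the original code.

```r
library(XML)

# Small inline sample (hypothetical data) with two Person elements
sample_xml <- '<Alert><Records>
  <Person Id="100"><PersonNameDetails>
    <PersonNames id="Name1">
      <ReferenceGroup ReferenceGroupCode="ABC"/>
      <ReferenceGroup ReferenceGroupCode="DEF"/>
      <PersonNameValue><FirstName>Carl</FirstName><Surname>Toziz</Surname></PersonNameValue>
    </PersonNames>
  </PersonNameDetails></Person>
  <Person Id="200"><PersonNameDetails>
    <PersonNames id="Name1">
      <ReferenceGroup ReferenceGroupCode="MNO"/>
      <PersonNameValue><FirstName>Carol</FirstName><Surname>Tozize</Surname></PersonNameValue>
    </PersonNames>
  </PersonNameDetails></Person>
</Records></Alert>'

# asText = TRUE tells xmlParse that the argument is XML text, not a file name
doc <- xmlParse(sample_xml, asText = TRUE)
nodes <- getNodeSet(doc, "//PersonNames")

# Hypothetical helper: read the Id attribute of the enclosing Person element
person_id <- function(node) {
  ids <- xpathSApply(node, "ancestor::Person", xmlGetAttr, "Id")
  if (length(ids) == 0) NA_character_ else ids[[1]]
}

# One Id per PersonNames node, in document order
ids <- sapply(nodes, person_id)
```

In the answer's code, the same idea would amount to something like y$ID <- sapply(x, person_id), which keeps each row tied to its own Person even when Person elements have different numbers of names.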