I have the following XML:
<?xml version="1.0" encoding="utf-8"?>
<NewDataSet>
<GUID>
<Active>true</Active>
<ContractName>Contract Name</ContractName>
<ContractNumber>Auto</ContractNumber>
<DateOfBirth>16/01/1988</DateOfBirth>
<FirstName>Fred</FirstName>
<Notes>some notes</Notes>
<PlaceOfResidence>United Kingdom</PlaceOfResidence>
<RowNumber>1</RowNumber>
<TableName>PersonDetails</TableName>
</GUID>
<GUID>
<Active>true</Active>
<ContractName>Contract Name</ContractName>
<ContractNumber>Auto</ContractNumber>
<DateOfBirth>01/01/1960</DateOfBirth>
<FirstName>Harold</FirstName>
<Notes>some notes</Notes>
<PlaceOfResidence>United Kingdom</PlaceOfResidence>
<RowNumber>2</RowNumber>
<TableName>PersonDetails</TableName>
</GUID>
<GUID>
<Active>true</Active>
<ContractName>Contract Name</ContractName>
<ContractNumber>Auto</ContractNumber>
<DateOfBirth>05/05/1955</DateOfBirth>
<FirstName>Mary</FirstName>
<Notes>some notes</Notes>
<PlaceOfResidence>United States</PlaceOfResidence>
<RowNumber>3</RowNumber>
<TableName>PersonDetails</TableName>
</GUID>
<GUID>
<ContractName>Contract Name</ContractName>
<ContractNumber>Auto</ContractNumber>
<CoverType>Property</CoverType>
<DateAdded>01/06/2017</DateAdded>
<Notes>some notes</Notes>
<RowNumber>1</RowNumber>
<TableName>Covers</TableName>
</GUID>
<GUID>
<ContractName>Contract Name</ContractName>
<ContractNumber>Auto</ContractNumber>
<CoverType>Motor</CoverType>
<DateAdded>01/06/2017</DateAdded>
<Notes>some notes</Notes>
<RowNumber>2</RowNumber>
<TableName>Covers</TableName>
</GUID>
<GUID>
<ContractName>Contract Name</ContractName>
<ContractNumber>Auto</ContractNumber>
<CoverType>Liability</CoverType>
<DateAdded>01/06/2017</DateAdded>
<Notes>some notes</Notes>
<RowNumber>3</RowNumber>
<TableName>Covers</TableName>
</GUID>
</NewDataSet>
I need to convert it to the following:
<data>
<ContractName>Contract Name</ContractName>
<ContractNumber>Auto</ContractNumber>
<Table>
<TableRow RowNumber="1" TableName="PersonDetails">
<FirstName>Fred</FirstName>
<PlaceOfResidence>United Kingdom</PlaceOfResidence>
<DateOfBirth>16/01/1988</DateOfBirth>
<Active>true</Active>
</TableRow>
<TableRow RowNumber="2" TableName="PersonDetails">
<FirstName>Harold</FirstName>
<PlaceOfResidence>United Kingdom</PlaceOfResidence>
<DateOfBirth>01/01/1960</DateOfBirth>
<Active>true</Active>
</TableRow>
<TableRow RowNumber="3" TableName="PersonDetails">
<FirstName>Mary</FirstName>
<PlaceOfResidence>United States</PlaceOfResidence>
<DateOfBirth>05/05/1955</DateOfBirth>
<Active>true</Active>
</TableRow>
</Table>
<Table>
<TableRow RowNumber="1" TableName="Covers">
<CoverType>Property</CoverType>
<DateAdded>01/06/2017</DateAdded>
</TableRow>
<TableRow RowNumber="2" TableName="Covers">
<CoverType>Motor</CoverType>
<DateAdded>01/06/2017</DateAdded>
</TableRow>
<TableRow RowNumber="3" TableName="Covers">
<CoverType>Liability</CoverType>
<DateAdded>01/06/2017</DateAdded>
</TableRow>
</Table>
<Notes>some notes</Notes>
</data>
I can only use XSLT 1.0.
So far, I have:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="utf-16"/>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="*[(*)]">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="/">
<data>
<xsl:apply-templates select="@* | node()"/>
</data>
</xsl:template>
</xsl:stylesheet>
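This removes the `<NewDataSet>` and `<GUID>` tags and replaces them with `<data>`.
However, I am not sure how to generate the two table groupings, or how to automatically* identify the repeated values: ContractName, ContractNumber and Notes.
*Other repeated values may appear later.
Any help or pointers would be greatly appreciated.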
Answer 0 (score: 0)
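Part of the challenge here is that you have a flat list of `GUID` elements that you need to group by their `TableName` values. This looks like a classic case for Muenchian grouping, a technique described in detail on Jeni Tennison's site: http://www.jenitennison.com/xslt/grouping/muenchian.html
The following XSL, placed inside an `xsl:stylesheet` root element, produces output that is almost identical to your desired XML; the only difference is the order of the elements inside `<TableRow>`. Several aspects of this transformation are unclear, however, and I have pointed them out in the code comments.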
<xsl:output method="xml" version="1.0" encoding="utf-16" indent="yes"/>
<!-- Avoids newline whitespace where source elements are excluded from output -->
<xsl:strip-space elements="*"/>
<!-- The input XML appears to be a flat-ish list of `GUID` elements,
each of which represents a single row of data in any of various tables.
In reorganizing this data, we want to group by `GUID/TableName` values.
Muenchian grouping is probably the best way to do this in XSLT 1.0:
see more at http://www.jenitennison.com/xslt/grouping/muenchian.html.
This requires a key, so before we try to process the `GUID`s, we set
up the key. -->
<xsl:key name="table" match="GUID" use="TableName"/>
<!-- Begin at the beginning: the root and topmost element. -->
<xsl:template match="/NewDataSet">
<data>
<!-- I see in your desired output XML that you've put
`ContractName` and `ContractNumber` just under the
top-level `data` element. This appears to assume
that ALL of these have the same value for ALL of
the individual `GUID` recordsets.
In your input XML, these two are elements under `GUID`,
so these data fields are included in the individual data
records. NOTE: If there is *any* chance that the values
in these fields might differ between records, these
should be kept within the table rows, and *not* moved
to the same level as the output `Table` elements. -->
<!-- This just naively copies these two elements from the first
`GUID` that has them.
Again, if these values have any chance of differing between
`GUID`s, this whole approach is flawed. -->
<xsl:copy-of select="GUID[ContractName][1]/ContractName"/>
<xsl:copy-of select="GUID[ContractNumber][1]/ContractNumber"/>
<!-- We want to process `GUID`s after grouping by `TableName` values.
This `for-each` is part of the Muenchian grouping technique. See
Jeni Tennison's page (linked above) for a detailed explanation. -->
<xsl:for-each select="GUID[count(. | key('table', TableName)[1]) = 1]">
<!-- If you wanted to sort alphabetically by TableName, you'd use:
<xsl:sort select="TableName" /> -->
<Table>
<!-- Now, within each `table`, we want to process all those
`GUID`s with this same corresponding `TableName`. -->
<xsl:for-each select="key('table', TableName)">
<!-- We select "this", since we want to process the matching
`GUID`, not just its children. -->
<xsl:apply-templates select="."/>
</xsl:for-each>
</Table>
</xsl:for-each>
<!-- This just copies the `Notes` element from the first `GUID` that has a
`Notes` child.
Similar to `ContractName` and `ContractNumber`, this naively assumes that
all `Notes` elements have identical content. This approach is flawed if
there is *any* possibility of different values. -->
<xsl:copy-of select="GUID[Notes][1]/Notes"/>
</data>
</xsl:template>
<!-- List up the elements we don't want to copy verbatim into each table row -->
<xsl:variable name="nocopy">
<item>ContractName</item>
<item>ContractNumber</item>
<item>Notes</item>
<item>RowNumber</item>
<item>TableName</item>
</xsl:variable>
<xsl:template match="GUID">
<TableRow RowNumber="{RowNumber}" TableName="{TableName}">
<!-- Copy over child data, but _only_ if it's not in `$nocopy` -->
<xsl:copy-of select="*[not(name() = $nocopy/item)]"/>
</TableRow>
</xsl:template>
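One caveat, given the XSLT 1.0 restriction: `$nocopy` is a result tree fragment, and the XSLT 1.0 specification does not permit path steps such as `$nocopy/item` on one, so a strict 1.0 processor may reject that expression. The sketch below is one possible workaround, not the answer's own code. It assumes your processor offers the EXSLT `exsl:node-set()` extension function (many do, but check yours), and the variable name `nocopy-items` is made up for illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">

  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Muenchian grouping key: all GUID rows that share a TableName -->
  <xsl:key name="table" match="GUID" use="TableName"/>

  <!-- Child elements that must not be copied verbatim into a table row -->
  <xsl:variable name="nocopy">
    <item>ContractName</item>
    <item>ContractNumber</item>
    <item>Notes</item>
    <item>RowNumber</item>
    <item>TableName</item>
  </xsl:variable>

  <!-- Hypothetical helper: turn the result tree fragment into a real
       node-set. ASSUMPTION: the processor supports exsl:node-set(). -->
  <xsl:variable name="nocopy-items" select="exsl:node-set($nocopy)/item"/>

  <xsl:template match="/NewDataSet">
    <data>
      <xsl:copy-of select="GUID[ContractName][1]/ContractName"/>
      <xsl:copy-of select="GUID[ContractNumber][1]/ContractNumber"/>
      <!-- Visit the first GUID of each TableName group -->
      <xsl:for-each select="GUID[count(. | key('table', TableName)[1]) = 1]">
        <Table>
          <xsl:apply-templates select="key('table', TableName)"/>
        </Table>
      </xsl:for-each>
      <xsl:copy-of select="GUID[Notes][1]/Notes"/>
    </data>
  </xsl:template>

  <xsl:template match="GUID">
    <TableRow RowNumber="{RowNumber}" TableName="{TableName}">
      <!-- A string compared with a node-set is true if any node matches -->
      <xsl:copy-of select="*[not(name() = $nocopy-items)]"/>
    </TableRow>
  </xsl:template>
</xsl:stylesheet>
```

If no node-set extension is available, the same filter can be written without any variable as `*[not(self::ContractName or self::ContractNumber or self::Notes or self::RowNumber or self::TableName)]`, at the cost of hard-coding the names in the template.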
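Re-reading the text of your post (and not just your code :)), I see that you are asking how to identify the repeated elements. However, your desired XML output already seems to assume that the `ContractName`, `ContractNumber`, and `Notes` elements must be identical across all `GUID` structures.
This is confusing: your desired output already assumes the answer to your question.
Are you asking, "How do I identify the `GUID` child elements that are common to all `GUID` structures, and create a single top-level copy of these, while removing them from the output `GUID` structures?"
In XSL it is easy to determine whether a given element exists anywhere in a set of XML structures.
Determining whether a given element exists in each of a set of XML structures is not so easy. It is possible, though ugly. :)
Replace the XSL above with the following.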
<xsl:output method="xml" version="1.0" encoding="utf-16" indent="yes"/>
<!-- Avoids newline whitespace where source elements are excluded from output -->
<xsl:strip-space elements="*"/>
<!-- The input XML appears to be a flat-ish list of `GUID` elements,
each of which represents a single row of data in any of various tables.
In reorganizing this data, we want to group by `GUID/TableName` values.
Muenchian grouping is probably the best way to do this in XSLT 1.0:
see more at http://www.jenitennison.com/xslt/grouping/muenchian.html.
This requires a key, so before we try to process the `GUID`s, we set
up the key. -->
<xsl:key name="table" match="GUID" use="TableName"/>
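This `$kids` variable is the key part of how we determine which `GUID` children are common to all `GUID` structures. There may be more elegant and efficient ways to do this in XSL 1.0; running this on a small dataset takes 0.8 seconds in Oxygen XML (using the Saxon-HE 9.6.0.7 processor).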
<!-- Build list of unique GUID children that appear in all GUID structures -->
<xsl:variable name="kids">
<xsl:for-each select="/NewDataSet/GUID/*">
<xsl:variable name="this" select="."/>
<!-- Intermediate variable used to collect results of whether the
given child is in each GUID -->
<xsl:variable name="in_all">
<xsl:for-each select="/NewDataSet/GUID/*">
<xsl:if test="name($this) = name(.) and $this = .">
<result><xsl:value-of select="true()"/></result>
</xsl:if>
</xsl:for-each>
</xsl:variable>
<!-- If we have the same number of `result`s as we have number of GUIDs,
then output the first of each such child (there are dupes otherwise). -->
<xsl:if test="count($in_all/result) = count(/NewDataSet/GUID) and
not(.=preceding::*)">
<xsl:copy-of select="$this"/>
</xsl:if>
</xsl:for-each>
</xsl:variable>
This part is mostly the same, except for the bit about `$kids`.
<!-- Begin at the beginning: the root and topmost element. -->
<xsl:template match="/NewDataSet">
<data>
<!-- We'll put common elements at the top of the `data` structure. -->
<xsl:copy-of select="$kids"/>
<!-- We want to process `GUID`s after grouping by `TableName` values.
This `for-each` is part of the Muenchian grouping technique. See
Jeni Tennison's page (linked above) for a detailed explanation. -->
<xsl:for-each select="GUID[count(. | key('table', TableName)[1]) = 1]">
<!-- If you wanted to sort alphabetically by TableName, you'd use:
<xsl:sort select="TableName" /> -->
<Table>
<!-- Now, within each `table`, we want to process all those
`GUID`s with this same corresponding `TableName`. -->
<xsl:for-each select="key('table', TableName)">
<!-- We select "this", since we want to process the matching
`GUID`, not just its children. -->
<xsl:apply-templates select="."/>
</xsl:for-each>
</Table>
</xsl:for-each>
</data>
</xsl:template>
Update `$nocopy` to include the element names identified by `$kids`.
<!-- List up the elements we don't want to copy verbatim into each table row -->
<xsl:variable name="nocopy">
<item>RowNumber</item>
<item>TableName</item>
<!-- Copy in the bits from $kids -->
<xsl:for-each select="$kids/*">
<item><xsl:value-of select="name(.)"/></item>
</xsl:for-each>
</xsl:variable>
<xsl:template match="GUID">
<TableRow RowNumber="{RowNumber}" TableName="{TableName}">
<!-- Copy over child data, but _only_ if it's not in `$nocopy` -->
<xsl:copy-of select="*[not(name() = $nocopy/item)]"/>
</TableRow>
</xsl:template>
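The generated output is now functionally identical to the desired output XML. The only differences are in element order: the `TableRow` children appear in a different order, and `Notes` sits at the top of `data`, together with `ContractName` and `ContractNumber`, rather than at the bottom of `data`.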
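For comparison, here is a sketch of another way to find the children common to every `GUID`. It avoids both the nested loops and path steps on variables, so it should stay within plain XSLT 1.0. It is not the answer's method: the key name `field` and the variable `guid-count` are made up for illustration, and it assumes that no `GUID` contains the same child name with the same value more than once. A child counts as common when its name/value pair occurs as many times as there are `GUID` elements.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Muenchian grouping key, as in the answer -->
  <xsl:key name="table" match="GUID" use="TableName"/>

  <!-- Hypothetical key: index every GUID child by "name|value" -->
  <xsl:key name="field" match="GUID/*" use="concat(name(), '|', .)"/>

  <!-- Hypothetical helper: total number of GUID rows in the input -->
  <xsl:variable name="guid-count" select="count(/NewDataSet/GUID)"/>

  <xsl:template match="/NewDataSet">
    <data>
      <!-- A child of the first GUID is common to all GUIDs when its
           name/value pair occurs once per GUID.
           ASSUMPTION: a pair never repeats inside a single GUID. -->
      <xsl:copy-of select="GUID[1]/*[count(key('field',
                           concat(name(), '|', .))) = $guid-count]"/>
      <!-- Visit the first GUID of each TableName group -->
      <xsl:for-each select="GUID[count(. | key('table', TableName)[1]) = 1]">
        <Table>
          <xsl:apply-templates select="key('table', TableName)"/>
        </Table>
      </xsl:for-each>
    </data>
  </xsl:template>

  <xsl:template match="GUID">
    <TableRow RowNumber="{RowNumber}" TableName="{TableName}">
      <!-- Copy every child except the two attribute sources and the
           children already hoisted to the top of `data` -->
      <xsl:copy-of select="*[not(self::RowNumber or self::TableName)]
                            [count(key('field', concat(name(), '|', .)))
                             != $guid-count]"/>
    </TableRow>
  </xsl:template>
</xsl:stylesheet>
```

As with `$kids`, the common elements end up at the top of `data`. Note that with a single `GUID` in the input every child counts as common, so the one row would come out empty.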
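It seems a little odd to include the table's name as an attribute on every table row. It would make more sense as an attribute of the `Table` element itself.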
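Likewise, having a `RowNumber` attribute on every row seems redundant. That information can be gathered simply by looking at the `position()` of each `TableRow` within its parent `Table`.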
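A minimal sketch of that alternative layout follows, assuming you are free to change the output format. It states the table name once on `Table` and drops `RowNumber`, sorting the rows so that document order carries the numbering. To stay short it leaves `ContractName`, `ContractNumber` and `Notes` inside the rows; hoisting them works the same way as before.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:key name="table" match="GUID" use="TableName"/>

  <xsl:template match="/NewDataSet">
    <data>
      <!-- Visit the first GUID of each TableName group -->
      <xsl:for-each select="GUID[count(. | key('table', TableName)[1]) = 1]">
        <!-- The table name is stated once, on the Table element -->
        <Table TableName="{TableName}">
          <xsl:for-each select="key('table', TableName)">
            <!-- Emit rows in RowNumber order so that a row's position
                 within its Table matches its original RowNumber -->
            <xsl:sort select="RowNumber" data-type="number"/>
            <TableRow>
              <xsl:copy-of select="*[not(self::RowNumber or self::TableName)]"/>
            </TableRow>
          </xsl:for-each>
        </Table>
      </xsl:for-each>
    </data>
  </xsl:template>
</xsl:stylesheet>
```

A consumer of this output can recover a row's number with `count(preceding-sibling::TableRow) + 1`, or with `position()` while iterating over the `TableRow` children of a `Table`.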
That said, you know your requirements. This is just a matter of getting things working. :)