I need to transpose an XML file such as this one:
<?xml version="1.0" encoding="UTF-8"?>
<products>
<product id="10" name="Widget 1" price="15.99" category_code="T" category="Toys" manufacturer_code="SC1" manufacturer="Some Company 1" />
<product id="10" name="Widget 1" price="15.99" category_code="E" category="Electronics" manufacturer_code="SC1" manufacturer="Some Company 1" />
<product id="10" name="Widget 1" price="15.99" category_code="T" category="Toys" manufacturer_code="SC2" manufacturer="Some Company 2" />
<product id="10" name="Widget 1" price="15.99" category_code="E" category="Electronics" manufacturer_code="SC2" manufacturer="Some Company 2" />
<product id="10" name="Widget 1" price="15.99" category_code="T" category="Toys" manufacturer_code="SC3" manufacturer="Some Company 3" />
<product id="10" name="Widget 1" price="15.99" category_code="E" category="Electronics" manufacturer_code="SC3" manufacturer="Some Company 3" />
<product id="11" name="Widget 2" price="21.99" category_code="V" category="Video Games" manufacturer_code="SC4" manufacturer="Some Company 4" />
<product id="12" name="Widget 3" price="10.99" category_code="T" category="Toys" manufacturer_code="SC1" manufacturer="Some Company 2" />
</products>
into a comma-separated text file or a properly formatted HTML table with exactly one row per product, like this:
id,name,price,category_code_1,category_1,category_code_2,category_2,manufacturer_code_1,manufacturer_1,manufacturer_code_2,manufacturer_2,manufacturer_code_3,manufacturer_3
10, Widget 1, 15.99, T, Toys, E, Electronics, SC1, Some Company 1, SC2, Some Company 2, SC3, Some Company 3
11, Widget 2, 21.99, V, Video Games,,, SC4, Some Company 4,,,,
12, Widget 3, 10.99, T, Toys,,, SC1, Some Company 2,,,,
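As you may have noticed, the XML data can be thought of as the result of joining three tables: product, product_category and product_manufacturer. Each product can belong to several categories and have several manufacturers. The real data I'm dealing with is of course more complex and from a completely different domain, but this sample describes the problem accurately.
My knowledge of XSLT is very limited. With help from SO and other resources on the Internet, I have put together a stylesheet that gives me part of what I need: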
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:key name="key_product_group" match="product" use="@id"/>
<xsl:key name="key_category_group" match="product" use="concat(
@id,
@category_code,
@category)"/>
<xsl:key name="key_manufacturer_group" match="product" use="concat(
@id,
@manufacturer_code,
@manufacturer)"/>
<xsl:variable name="var_max_category_group" >
<xsl:for-each select="//product[generate-id(.) = generate-id(key('key_product_group',@id) )]">
<xsl:sort select="count(key('key_product_group',@id )[generate-id() = generate-id(key('key_category_group',
concat(@id,
@category_code,
@category)))])" order="descending"/>
<xsl:if test="position() = 1">
<xsl:value-of select="count(key('key_product_group',@id )[generate-id() = generate-id(key('key_category_group',
concat(@id,
@category_code,
@category)))])"/>
</xsl:if>
</xsl:for-each>
</xsl:variable>
<xsl:variable name="var_max_manufacturer_group">
<xsl:for-each select="//product[generate-id(.) = generate-id(key('key_product_group',@id) )]">
<xsl:sort select="count(key('key_product_group',@id )[generate-id() = generate-id(key('key_manufacturer_group',
concat(@id,
@category_code,
@category)))])" order="descending"/>
<xsl:if test="position() = 1">
<xsl:value-of select="count(key('key_product_group',@id )[generate-id() = generate-id(key('key_manufacturer_group',
concat(@id,
@manufacturer_code,
@manufacturer)))])"/>
</xsl:if>
</xsl:for-each>
</xsl:variable>
<xsl:template match="/">
<xsl:text>id,</xsl:text>
<xsl:text>name,</xsl:text>
<xsl:text>price,</xsl:text>
<xsl:call-template name="loop_pcat">
<xsl:with-param name="count" select="$var_max_category_group"/>
</xsl:call-template>
<xsl:call-template name="loop_pmf">
<xsl:with-param name="count" select="$var_max_manufacturer_group"/>
</xsl:call-template>
<br></br>
<xsl:variable name="var_group"
select="//product[generate-id(.) = generate-id(key('key_product_group',@id)[1])]"/>
<xsl:for-each select="$var_group">
<xsl:sort order="ascending" select="@id"/>
<xsl:value-of select="@id"/>,
<xsl:value-of select="@name"/>,
<xsl:value-of select="@price"/>,
<xsl:for-each select="key('key_product_group',@id )[generate-id() = generate-id(key('key_category_group',
concat(@id,
@category_code,
@category)))]">
<xsl:value-of select="@category_code"/>,
<xsl:value-of select="@category"/>,
</xsl:for-each>
<xsl:for-each select="key('key_product_group',@id )[generate-id() = generate-id(key('key_manufacturer_group',
concat(@id,
@manufacturer_code,
@manufacturer)))]">
<xsl:value-of select="@manufacturer_code"/>,
<xsl:value-of select="@manufacturer"/>,
</xsl:for-each>
<br></br>
</xsl:for-each>
</xsl:template>
<xsl:template name="loop_pcat">
<xsl:param name="count" select="1"/>
<xsl:param name="limit" select="$count+1"/>
<xsl:if test="$count > 0">
<xsl:value-of select="concat('category_code_',$limit - $count)"/>
<xsl:text>,</xsl:text>
<xsl:value-of select="concat('category_',$limit - $count)"/>
<xsl:text>,</xsl:text>
<xsl:call-template name="loop_pcat">
<xsl:with-param name="count" select="$count - 1"/>
<xsl:with-param name="limit" select="$limit"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
<xsl:template name="loop_pmf">
<xsl:param name="count" select="1"/>
<xsl:param name="limit" select="$count+1"/>
<xsl:if test="$count > 0">
<xsl:value-of select="concat('manufacturer_code_',$limit - $count)"/>
<xsl:text>,</xsl:text>
<xsl:value-of select="concat('manufacturer_',$limit - $count)"/>
<xsl:text>,</xsl:text>
<xsl:call-template name="loop_pmf">
<xsl:with-param name="count" select="$count - 1"/>
<xsl:with-param name="limit" select="$limit"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
The stylesheet above produces the following result:
id,name,price,category_code_1,category_1,category_code_2,category_2,manufacturer_code_1,manufacturer_1,manufacturer_code_2,manufacturer_2,manufacturer_code_3,manufacturer_3,
10, Widget 1, 15.99, T, Toys, E, Electronics, SC1, Some Company 1, SC2, Some Company 2, SC3, Some Company 3,
11, Widget 2, 21.99, V, Video Games, SC4, Some Company 4,
12, Widget 3, 10.99, T, Toys, SC1, Some Company 2,
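The output has at least one major problem: not every row contains all the columns. For example, rows 2 and 3 are missing category_code_2 and category_2, as well as the code and name columns for manufacturers 2 and 3. I'm sure the stylesheet has other problems too, and I don't know how it will perform on a relatively large XML file, but for now I would really appreciate help getting it to produce the required output format.
Thanks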
MRSA
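Answer 0 (score: 1)
I had started on my own implementation, but then you posted yours and I was able to reuse some of your code to speed things up. The main differences include a separator between the concatenated key values ('#'). Here is the stylesheet I ended up with and its output: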
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" version="1.0" encoding="UTF-8"/>
<xsl:key name="byProduct" match="product" use="@id"/>
<xsl:key name="byProductAndCategory" match="product" use="concat(@id,'#',@category_code)"/>
<xsl:key name="byProductAndManufacturer" match="product" use="concat(@id,'#',@manufacturer_code)"/>
<xsl:variable name="delimiter" select="','"/>
<xsl:variable name="eol" select="'&#10;'"/>
<xsl:variable name="maxManufacturers">
<xsl:for-each select="//product[count(.|key('byProduct',@id)[1])=1]">
<xsl:sort select="count(key('byProductAndCategory',concat(@id,'#',@category_code)))" data-type="number"/>
<xsl:if test="position()=last()">
<xsl:value-of select="count(key('byProductAndCategory',concat(@id,'#',@category_code)))"/>
</xsl:if>
</xsl:for-each>
</xsl:variable>
<xsl:variable name="maxCategories">
<xsl:for-each select="//product[count(.|key('byProduct',@id)[1])=1]">
<xsl:sort select="count(key('byProductAndManufacturer',concat(@id,'#',@manufacturer_code)))" data-type="number"/>
<xsl:if test="position()=last()">
<xsl:value-of select="count(key('byProductAndManufacturer',concat(@id,'#',@manufacturer_code)))"/>
</xsl:if>
</xsl:for-each>
</xsl:variable>
<xsl:template match="/">
<!--create the file header-->
<xsl:text>id</xsl:text>
<xsl:value-of select="$delimiter"/>
<xsl:text>name</xsl:text>
<xsl:value-of select="$delimiter"/>
<xsl:text>price</xsl:text>
<!--recursive loop for each group to add appropriate number of columns, i.e. maximum number of repetitions-->
<xsl:call-template name="loop_pcat_header">
<xsl:with-param name="count" select="$maxCategories"/>
</xsl:call-template>
<!--recursive loop for each group to add appropriate number of columns, i.e. maximum number of repetitions-->
<xsl:call-template name="loop_pmf_header">
<xsl:with-param name="count" select="$maxManufacturers"/>
</xsl:call-template>
<xsl:value-of select="$eol"/>
<xsl:apply-templates select="//product[count(.|key('byProduct',@id)[1])=1]"/>
</xsl:template>
<xsl:template match="product">
<!-- id -->
<xsl:value-of select="@id"/>
<xsl:value-of select="$delimiter"/>
<!-- name -->
<xsl:value-of select="@name"/>
<xsl:value-of select="$delimiter"/>
<!-- price -->
<xsl:value-of select="@price"/>
<!-- category stuff -->
<xsl:call-template name="loop_pcat_data">
<xsl:with-param name="data" select="key('byProductAndManufacturer',concat(@id,'#',@manufacturer_code))"/>
<xsl:with-param name="count" select="$maxCategories"/>
</xsl:call-template>
<!-- manufacturer stuff -->
<xsl:call-template name="loop_pmf_data">
<xsl:with-param name="data" select="key('byProductAndCategory',concat(@id,'#',@category_code))"/>
<xsl:with-param name="count" select="$maxManufacturers"/>
</xsl:call-template>
<xsl:value-of select="$eol"/>
</xsl:template>
<!--recursive loop templates for file header and data-->
<xsl:template name="loop_pcat_header">
<xsl:param name="count" select="1"/>
<xsl:param name="limit" select="$count+1"/>
<xsl:if test="$count > 0">
<xsl:value-of select="$delimiter"/>
<xsl:value-of select="concat('category_code_',$limit - $count)"/>
<xsl:value-of select="$delimiter"/>
<xsl:value-of select="concat('category_',$limit - $count)"/>
<xsl:call-template name="loop_pcat_header">
<xsl:with-param name="count" select="$count - 1"/>
<xsl:with-param name="limit" select="$limit"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
<xsl:template name="loop_pmf_header">
<xsl:param name="count" select="1"/>
<xsl:param name="limit" select="$count+1"/>
<xsl:if test="$count > 0">
<xsl:value-of select="$delimiter"/>
<xsl:value-of select="concat('manufacturer_code_',$limit - $count)"/>
<xsl:value-of select="$delimiter"/>
<xsl:value-of select="concat('manufacturer_',$limit - $count)"/>
<xsl:call-template name="loop_pmf_header">
<xsl:with-param name="count" select="$count - 1"/>
<xsl:with-param name="limit" select="$limit"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
<xsl:template name="loop_pcat_data">
<xsl:param name="data"/>
<xsl:param name="count" select="1"/>
<xsl:param name="limit" select="$count+1"/>
<xsl:if test="$count > 0">
<xsl:value-of select="$delimiter"/>
<xsl:value-of select="$data[$limit - $count]/@category_code"/>
<xsl:value-of select="$delimiter"/>
<xsl:value-of select="$data[$limit - $count]/@category"/>
<xsl:call-template name="loop_pcat_data">
<xsl:with-param name="data" select="$data"/>
<xsl:with-param name="count" select="$count - 1"/>
<xsl:with-param name="limit" select="$limit"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
<xsl:template name="loop_pmf_data">
<xsl:param name="data"/>
<xsl:param name="count" select="1"/>
<xsl:param name="limit" select="$count+1"/>
<xsl:if test="$count > 0">
<xsl:value-of select="$delimiter"/>
<xsl:value-of select="$data[$limit - $count]/@manufacturer_code"/>
<xsl:value-of select="$delimiter"/>
<xsl:value-of select="$data[$limit - $count]/@manufacturer"/>
<xsl:call-template name="loop_pmf_data">
<xsl:with-param name="data" select="$data"/>
<xsl:with-param name="count" select="$count - 1"/>
<xsl:with-param name="limit" select="$limit"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
The requested output:
id,name,price,category_code_1,category_1,category_code_2,category_2,manufacturer_code_1,manufacturer_1,manufacturer_code_2,manufacturer_2,manufacturer_code_3,manufacturer_3
10,Widget 1,15.99,T,Toys,E,Electronics,SC1,Some Company 1,SC2,Some Company 2,SC3,Some Company 3
11,Widget 2,21.99,V,Video Games,,,SC4,Some Company 4,,,,
12,Widget 3,10.99,T,Toys,,,SC1,Some Company 2,,,,
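A side note on the padding, as a minimal sketch rather than a replacement for the stylesheet above. In XPath 1.0 a positional predicate past the end of a node-set selects nothing, and xsl:value-of on an empty node-set writes an empty string, which is why the data loops produce blank columns. The stylesheet above gets its repeating groups by counting rows that share a product and a category (or a product and a manufacturer), which relies on every category being paired with every manufacturer, as in the sample. The sketch below instead picks the distinct categories of each product directly. It reuses the key names byProduct and byProductAndCategory from above; the variable name cats is only illustrative.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8"/>
  <xsl:key name="byProduct" match="product" use="@id"/>
  <xsl:key name="byProductAndCategory" match="product" use="concat(@id,'#',@category_code)"/>

  <xsl:template match="/">
    <!-- one output line per distinct product id -->
    <xsl:for-each select="//product[count(.|key('byProduct',@id)[1])=1]">
      <!-- distinct categories of this product: the first row of each (id, category_code) group -->
      <xsl:variable name="cats"
        select="key('byProduct',@id)[count(.|key('byProductAndCategory',concat(@id,'#',@category_code))[1])=1]"/>
      <xsl:value-of select="@id"/>
      <xsl:text>,</xsl:text>
      <xsl:value-of select="$cats[1]/@category_code"/>
      <xsl:text>,</xsl:text>
      <!-- $cats[2] is empty for a product with a single category, so nothing is written here -->
      <xsl:value-of select="$cats[2]/@category_code"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Run against the sample XML from the question, this should print 10,T,E then 11,V, then 12,T, with the second category column left blank for products 11 and 12.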
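Answer 1 (score: 0)
So this is what I came up with. I still need to deal with the extra comma at the end of each line, and with special characters such as single or double quotes in attribute values. Any help with optimizing the code is much appreciated.
The following stylesheet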
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" version="1.0" encoding="UTF-8" indent="yes"/>
<!--Generate keys, first is the product group-->
<xsl:key name="key_product_group" match="product" use="@id"/>
<!--Here is keys for groupings that are needed, note concatenation of attributes-->
<xsl:key name="key_category_group" match="product" use="concat(
@id,
@category_code,
@category)"/>
<!--another key-->
<xsl:key name="key_manufacturer_group" match="product" use="concat(
@id,
@manufacturer_code,
@manufacturer)"/>
<!--we need the maximum number of distinct groups for each product so that can format the layout properly-->
<xsl:variable name="var_max_category_group" >
<xsl:for-each select="//product[generate-id(.) = generate-id(key('key_product_group',@id) )]">
<xsl:sort select="count(key('key_product_group',@id )[generate-id() = generate-id(key('key_category_group',
concat(@id,
@category_code,
@category)))])" data-type="number" order="descending"/>
<xsl:if test="position() = 1">
<xsl:value-of select="count(key('key_product_group',@id )[generate-id() = generate-id(key('key_category_group',
concat(@id,
@category_code,
@category)))])"/>
</xsl:if>
</xsl:for-each>
</xsl:variable>
<!--maximum for the second group-->
<xsl:variable name="var_max_manufacturer_group">
<xsl:for-each select="//product[generate-id(.) = generate-id(key('key_product_group',@id) )]">
<xsl:sort select="count(key('key_product_group',@id )[generate-id() = generate-id(key('key_manufacturer_group',
concat(@id,
@manufacturer_code,
@manufacturer)))])" data-type="number" order="descending"/>
<xsl:if test="position() = 1">
<xsl:value-of select="count(key('key_product_group',@id )[generate-id() = generate-id(key('key_manufacturer_group',
concat(@id,
@manufacturer_code,
@manufacturer)))])"/>
</xsl:if>
</xsl:for-each>
</xsl:variable>
<!--delmiter and line break characters defined here-->
<xsl:variable name="var_delimiter" select="','"/>
<xsl:variable name="var_line_break" select="'&#10;'"/>
<!--main procedure starts here-->
<xsl:template match="/">
<!--create the file header-->
<xsl:text>id</xsl:text>
<xsl:value-of select="$var_delimiter"/>
<xsl:text>name</xsl:text>
<xsl:value-of select="$var_delimiter"/>
<xsl:text>price</xsl:text>
<xsl:value-of select="$var_delimiter"/>
<!--recursive loop for each group to add appropriate number of columns, i.e. maximum number of repetitions-->
<xsl:call-template name="loop_pcat_header">
<xsl:with-param name="count" select="$var_max_category_group"/>
</xsl:call-template>
<!--recursive loop for each group to add appropriate number of columns, i.e. maximum number of repetitions-->
<xsl:call-template name="loop_pmf_header">
<xsl:with-param name="count" select="$var_max_manufacturer_group"/>
</xsl:call-template>
<xsl:value-of select="$var_line_break"/>
<!--select products and order ascending, the ordering is not really needed-->
<xsl:variable name="var_group"
select="//product[generate-id(.) = generate-id(key('key_product_group',@id)[1])]"/>
<xsl:for-each select="$var_group">
<xsl:sort order="ascending" select="@id"/>
<!--select non-repeatable attributes-->
<xsl:value-of select="@id"/>
<xsl:value-of select="$var_delimiter"/>
<xsl:value-of select="@name"/>
<xsl:value-of select="$var_delimiter"/>
<xsl:value-of select="@price"/>
<xsl:value-of select="$var_delimiter"/>
<!--select repeatable groups for each product-->
<xsl:for-each select="key('key_product_group',@id )[generate-id() = generate-id(key('key_category_group',
concat(@id,
@category_code,
@category)))]">
<xsl:value-of select="@category_code"/>
<xsl:value-of select="$var_delimiter"/>
<xsl:value-of select="@category"/>
<xsl:value-of select="$var_delimiter"/>
</xsl:for-each>
<!--after the loop, count this product's distinct groups once and pad with blanks up to the maximum-->
<xsl:variable name="var_max_category_group_instance" select="count(key('key_product_group',@id )[generate-id() = generate-id(key('key_category_group',
concat(@id,
@category_code,
@category)))])"/>
<xsl:call-template name="loop_pcat_data">
<xsl:with-param name="count" select="$var_max_category_group - $var_max_category_group_instance"/>
</xsl:call-template>
<!--select repeatable groups for each product-->
<xsl:for-each select="key('key_product_group',@id )[generate-id() = generate-id(key('key_manufacturer_group',
concat(@id,
@manufacturer_code,
@manufacturer)))]">
<xsl:value-of select="@manufacturer_code"/>
<xsl:value-of select="$var_delimiter"/>
<xsl:value-of select="@manufacturer"/>
<xsl:value-of select="$var_delimiter"/>
</xsl:for-each>
<!--after the loop, count this product's distinct groups once and pad with blanks up to the maximum-->
<xsl:variable name="var_max_manufacturer_group_instance" select="count(key('key_product_group',@id )[generate-id() = generate-id(key('key_manufacturer_group',
concat(@id,
@manufacturer_code,
@manufacturer)))])"/>
<xsl:call-template name="loop_pmf_data">
<xsl:with-param name="count" select="$var_max_manufacturer_group - $var_max_manufacturer_group_instance"/>
</xsl:call-template>
<xsl:value-of select="$var_line_break"/>
</xsl:for-each>
</xsl:template>
<!--recursive loop templates for file header and data-->
<xsl:template name="loop_pcat_header">
<xsl:param name="count" select="1"/>
<xsl:param name="limit" select="$count+1"/>
<xsl:if test="$count > 0">
<xsl:value-of select="concat('category_code_',$limit - $count)"/>
<xsl:value-of select="$var_delimiter"/>
<xsl:value-of select="concat('category_',$limit - $count)"/>
<xsl:value-of select="$var_delimiter"/>
<xsl:call-template name="loop_pcat_header">
<xsl:with-param name="count" select="$count - 1"/>
<xsl:with-param name="limit" select="$limit"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
<xsl:template name="loop_pcat_data">
<xsl:param name="count" select="1"/>
<xsl:param name="limit" select="$count+1"/>
<xsl:if test="$count > 0">
<xsl:value-of select="$var_delimiter"/>
<xsl:value-of select="$var_delimiter"/>
<xsl:call-template name="loop_pcat_data">
<xsl:with-param name="count" select="$count - 1"/>
<xsl:with-param name="limit" select="$limit"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
<xsl:template name="loop_pmf_header">
<xsl:param name="count" select="1"/>
<xsl:param name="limit" select="$count+1"/>
<xsl:if test="$count > 0">
<xsl:value-of select="concat('manufacturer_code_',$limit - $count)"/>
<xsl:value-of select="$var_delimiter"/>
<xsl:value-of select="concat('manufacturer_',$limit - $count)"/>
<xsl:value-of select="$var_delimiter"/>
<xsl:call-template name="loop_pmf_header">
<xsl:with-param name="count" select="$count - 1"/>
<xsl:with-param name="limit" select="$limit"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
<xsl:template name="loop_pmf_data">
<xsl:param name="count" select="1"/>
<xsl:param name="limit" select="$count+1"/>
<xsl:if test="$count > 0">
<xsl:value-of select="$var_delimiter"/>
<xsl:value-of select="$var_delimiter"/>
<xsl:call-template name="loop_pmf_data">
<xsl:with-param name="count" select="$count - 1"/>
<xsl:with-param name="limit" select="$limit"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
applied to this XML
<?xml version="1.0" encoding="UTF-8"?>
<products>
<product id="10" name="Widget 1" price="15.99" category_code="T" category="Toys" manufacturer_code="SC1" manufacturer="Some Company 1" />
<product id="10" name="Widget 1" price="15.99" category_code="E" category="Electronics" manufacturer_code="SC1" manufacturer="Some Company 1" />
<product id="10" name="Widget 1" price="15.99" category_code="T" category="Toys" manufacturer_code="SC2" manufacturer="Some Company 2" />
<product id="10" name="Widget 1" price="15.99" category_code="E" category="Electronics" manufacturer_code="SC2" manufacturer="Some Company 2" />
<product id="10" name="Widget 1" price="15.99" category_code="T" category="Toys" manufacturer_code="SC3" manufacturer="Some Company 3" />
<product id="10" name="Widget 1" price="15.99" category_code="E" category="Electronics" manufacturer_code="SC3" manufacturer="Some Company 3" />
<product id="11" name="Widget 2" price="21.99" category_code="V" category="Video Games" manufacturer_code="SC4" manufacturer="Some Company 4" />
<product id="12" name="Widget 3" price="10.99" category_code="T" category="Toys" manufacturer_code="SC1" manufacturer="Some Company 2" />
</products>
produces the following output:
id,name,price,category_code_1,category_1,category_code_2,category_2,manufacturer_code_1,manufacturer_1,manufacturer_code_2,manufacturer_2,manufacturer_code_3,manufacturer_3,
10,Widget 1,15.99,T,Toys,E,Electronics,SC1,Some Company 1,SC2,Some Company 2,SC3,Some Company 3,
11,Widget 2,21.99,V,Video Games,,,SC4,Some Company 4,,,,,
12,Widget 3,10.99,T,Toys,,,SC1,Some Company 2,,,,,
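For the two open points mentioned at the start of this answer, here is one possible approach, as a sketch only. The extra comma goes away if the delimiter is written between fields (or before every field except the first) instead of after each one. For special characters, the usual CSV convention is to wrap a value in double quotes when it contains a comma, a double quote or a line break, and to double any embedded double quote; single quotes need no special treatment. The template names csv_field and double_quotes are made up for this example, and only the three non-repeating columns are shown.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8"/>
  <xsl:key name="key_product_group" match="product" use="@id"/>

  <xsl:template match="/">
    <!-- one line per distinct product id -->
    <xsl:for-each select="//product[generate-id() = generate-id(key('key_product_group',@id)[1])]">
      <xsl:call-template name="csv_field">
        <xsl:with-param name="value" select="@id"/>
      </xsl:call-template>
      <!-- delimiters sit between fields, so the line never ends with a comma -->
      <xsl:text>,</xsl:text>
      <xsl:call-template name="csv_field">
        <xsl:with-param name="value" select="@name"/>
      </xsl:call-template>
      <xsl:text>,</xsl:text>
      <xsl:call-template name="csv_field">
        <xsl:with-param name="value" select="@price"/>
      </xsl:call-template>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

  <!-- writes one CSV field, quoting it only when needed -->
  <xsl:template name="csv_field">
    <xsl:param name="value"/>
    <xsl:choose>
      <xsl:when test="contains($value, ',') or contains($value, '&quot;') or contains($value, '&#10;')">
        <xsl:text>"</xsl:text>
        <xsl:call-template name="double_quotes">
          <xsl:with-param name="text" select="$value"/>
        </xsl:call-template>
        <xsl:text>"</xsl:text>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$value"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- recursively replaces every double quote in $text with two double quotes -->
  <xsl:template name="double_quotes">
    <xsl:param name="text"/>
    <xsl:choose>
      <xsl:when test="contains($text, '&quot;')">
        <xsl:value-of select="substring-before($text, '&quot;')"/>
        <xsl:text>""</xsl:text>
        <xsl:call-template name="double_quotes">
          <xsl:with-param name="text" select="substring-after($text, '&quot;')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

With the sample XML nothing needs quoting, so the lines come out as 10,Widget 1,15.99 and so on. A name such as Widget "1", large would be written as "Widget ""1"", large". The same csv_field call could wrap the category and manufacturer values in the full stylesheet.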