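I am trying to extract tables from OpenXML using xsltproc. Can someone help me with the following (my wish list), all with XSLT version 1.0?
1. How to filter out the non-table text
2. How to format the tables (possibly as CSV)
3. How to send the results to multiple output files (one table per file)
My attempt: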
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
<xsl:output method="text"/>
<xsl:template match="w:tbl">
<xsl:apply-templates/><xsl:for-each select="w:tr"><xsl:text>
</xsl:text>
</xsl:for-each>
<xsl:apply-templates/><xsl:for-each select="w:tcW"><xsl:text> ;</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
This produces the following output:
> xsltproc new2 word/document.xml
Node Selection and Pattern MatchingIn XSLT stylesheets, template rules for node selection and pattern matching are applied via the select attribute of the xsl:apply-templates command and the match attribute of the xsl:template element, respectively. A specification can be created to determine how to resolve issues in the event that a multiple number of applicable template rules exist, or alternately, when there are no applicable template rules at all.Table1 headingTextbodyBlahblahBody2Blah2Blah2
Table1 headingTextbodyBlahblahBody2Blah2Blah2Node SelectionWith the select attribute of xsl:apply-templates command, an XPath description can be used to either (1) select a multiple number of nodes with identical names, or (2) select a multiple number of nodes with differing names. Under scenario (1), using XPath to designate "ProductList/ Product" results in the selection of two Product element nodes.Table1 headingCol1TextbodyBlahCol1 BlahblahBody2Blah2Col1 Blah2Blah2Body3Blah3Col1 Blah3Blah3
Table1 headingCol1TextbodyBlahCol1 BlahblahBody2Blah2Col1 Blah2Blah2Body3Blah3Col1 Blah3Blah3
Expected output:
Table1 ;heading ;Text
body ;Blah ;blah
Body2 ;Blah2 ;Blah2
Table1 ;heading ;Col1 ;Text
body ;Blah ;Col1 Blah ;blah
Body2 ;Blah2 ;Col1 Blah2 ;Blah2
Body3 ;Blah3 ;Col1 Blah3 ;Blah3
An earlier attempt that partly worked for me has the following form, but it does not achieve 2/3 of the goals above.
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
<xsl:output method="text"/>
<xsl:template match="w:tr">
<xsl:apply-templates/><xsl:text>
</xsl:text>
</xsl:template>
<xsl:template match="w:tcW">
<xsl:apply-templates/><xsl:text> ;</xsl:text>
</xsl:template>
<xsl:template match="w:p">
<xsl:apply-templates/><xsl:if test="position()!=last()"><xsl:text>
</xsl:text></xsl:if>
</xsl:template>
</xsl:stylesheet>
My input XML:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:document xmlns:wpc="http://schemas.microsoft.com/office/word/2010/wordprocessingCanvas" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp14="http://schemas.microsoft.com/office/word/2010/wordprocessingDrawing" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml" xmlns:w15="http://schemas.microsoft.com/office/word/2012/wordml" xmlns:wpg="http://schemas.microsoft.com/office/word/2010/wordprocessingGroup" xmlns:wpi="http://schemas.microsoft.com/office/word/2010/wordprocessingInk" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml" xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape" mc:Ignorable="w14 w15 wp14"><w:body><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0" w:rsidP="003404B0"><w:r><w:t>Node Selection and Pattern Matching</w:t></w:r></w:p><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0" w:rsidP="003404B0"><w:r><w:t>In XSLT stylesheets, template rules for node selection and pattern matching are applied via the select attribute of the xsl:apply-templates command and the match attribute of the xsl:template element, respectively. 
A specification can be created to determine how to resolve issues in the event that a multiple number of applicable template rules exist, or alternately, when there are no applicable template rules at all.</w:t></w:r></w:p><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0"/><w:tbl><w:tblPr><w:tblStyle w:val="TableGrid"/><w:tblW w:w="0" w:type="auto"/><w:tblLook w:val="04A0" w:firstRow="1" w:lastRow="0" w:firstColumn="1" w:lastColumn="0" w:noHBand="0" w:noVBand="1"/></w:tblPr><w:tblGrid><w:gridCol w:w="3116"/><w:gridCol w:w="3117"/><w:gridCol w:w="3117"/></w:tblGrid><w:tr w:rsidR="003404B0" w:rsidTr="003404B0"><w:tc><w:tcPr><w:tcW w:w="3116" w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0"><w:r><w:t xml:space="preserve">Table1 </w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="3117" w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0"><w:r><w:t>heading</w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="3117" w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0"><w:r><w:t>Text</w:t></w:r></w:p></w:tc></w:tr><w:tr w:rsidR="003404B0" w:rsidTr="003404B0"><w:tc><w:tcPr><w:tcW w:w="3116" w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0"><w:r><w:t>body</w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="3117" w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0"><w:r><w:t>Blah</w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="3117" w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0"><w:r><w:t>blah</w:t></w:r></w:p></w:tc></w:tr><w:tr w:rsidR="003404B0" w:rsidTr="003404B0"><w:tc><w:tcPr><w:tcW w:w="3116" w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0"><w:r><w:t>Body2</w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="3117" w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0"><w:r><w:t>Blah2</w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="3117" w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" 
w:rsidRDefault="003404B0"><w:r><w:t>Blah2</w:t></w:r></w:p></w:tc></w:tr></w:tbl><w:p w:rsidR="006C4C5A" w:rsidRDefault="006C4C5A"/><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0" w:rsidP="003404B0"><w:r><w:t>Node Selection</w:t></w:r></w:p><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0" w:rsidP="003404B0"><w:r><w:t>With the select attribute of xsl:apply-templates command, an XPath description can be used to either (1) select a multiple number of nodes with identical names, or (2) select a multiple number of nodes with differing names. Under scenario (1), using XPath to designate "ProductList/ Product" results in the selection of two Product element nodes.</w:t></w:r></w:p><w:tbl><w:tblPr><w:tblStyle w:val="TableGrid"/><w:tblW w:w="0" w:type="auto"/><w:tblLook w:val="04A0" w:firstRow="1" w:lastRow="0" w:firstColumn="1" w:lastColumn="0" w:noHBand="0" w:noVBand="1"/></w:tblPr><w:tblGrid><w:gridCol w:w="2383"/><w:gridCol w:w="2420"/><w:gridCol w:w="2194"/><w:gridCol w:w="2353"/></w:tblGrid><w:tr w:rsidR="003404B0" w:rsidTr="003404B0"><w:tc><w:tcPr><w:tcW w:w="2383" w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0" w:rsidP="0042011C"><w:r><w:t xml:space="preserve">Table1 </w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="2420" w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0" w:rsidP="0042011C"><w:r><w:t>heading</w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="2194" w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0" w:rsidP="0042011C"><w:r><w:t>Col1</w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="2353" w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0" w:rsidP="0042011C"><w:r><w:t>Text</w:t></w:r></w:p></w:tc></w:tr><w:tr w:rsidR="003404B0" w:rsidTr="003404B0"><w:tc><w:tcPr><w:tcW w:w="2383" w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0" w:rsidP="0042011C"><w:r><w:t>body</w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="2420" 
w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0" w:rsidP="0042011C"><w:r><w:t>Blah</w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="2194" w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0" w:rsidP="0042011C"><w:r><w:t>Col1 Blah</w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="2353" w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0" w:rsidP="0042011C"><w:r><w:t>blah</w:t></w:r></w:p></w:tc></w:tr><w:tr w:rsidR="003404B0" w:rsidTr="003404B0"><w:tc><w:tcPr><w:tcW w:w="2383" w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0" w:rsidP="0042011C"><w:r><w:t>Body2</w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="2420" w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0" w:rsidP="0042011C"><w:r><w:t>Blah2</w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="2194" w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0" w:rsidP="0042011C"><w:r><w:t>Col1 Blah2</w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="2353" w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0" w:rsidP="0042011C"><w:r><w:t>Blah2</w:t></w:r></w:p></w:tc></w:tr><w:tr w:rsidR="003404B0" w:rsidTr="003404B0"><w:tc><w:tcPr><w:tcW w:w="2383" w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0" w:rsidP="003404B0"><w:r><w:t>B</w:t></w:r><w:r><w:t>ody3</w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="2420" w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0" w:rsidP="003404B0"><w:r><w:t>Blah3</w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="2194" w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0" w:rsidP="003404B0"><w:r><w:t>Col1 Blah3</w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="2353" w:type="dxa"/></w:tcPr><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0" w:rsidP="003404B0"><w:r><w:t>Blah3</w:t></w:r><w:bookmarkStart w:id="0" w:name="_GoBack"/><w:bookmarkEnd w:id="0"/></w:p></w:tc></w:tr></w:tbl><w:p 
w:rsidR="003404B0" w:rsidRDefault="003404B0" w:rsidP="003404B0"/><w:p w:rsidR="003404B0" w:rsidRDefault="003404B0" w:rsidP="003404B0"/><w:sectPr w:rsidR="003404B0"><w:pgSz w:w="12240" w:h="15840"/><w:pgMar w:top="1440" w:right="1440" w:bottom="1440" w:left="1440" w:header="720" w:footer="720" w:gutter="0"/><w:cols w:space="720"/><w:docGrid w:linePitch="360"/></w:sectPr></w:body></w:document>
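Answer 0: (score: 1)
You cannot use <xsl:apply-templates/> indiscriminately, because it will also apply the built-in templates, one of which copies text nodes. Note also that you have nodes matching your templates outside of any table. Try it this way: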
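For reference, here is a rough sketch of what the built-in rules amount to, written out as explicit templates (paraphrasing section 5.8 of the XSLT 1.0 specification). It is only an illustration, not something you need to add: a stylesheet containing nothing else should simply print every text node of the input, which is why the paragraphs outside the tables leak into your output.

```xml
<!-- Illustration only: the XSLT 1.0 built-in rules spelled out as explicit templates. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Root and element nodes: keep descending into the children. -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Text and attribute nodes: copy their string value to the output. -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Comments and processing instructions: output nothing. -->
  <xsl:template match="processing-instruction()|comment()"/>
</xsl:stylesheet>
```

The stylesheet that follows sidesteps these rules by selecting explicitly which nodes to process at each level.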
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <xsl:output method="text"/>

  <!-- Start at the root and process only the tables -->
  <xsl:template match="/">
    <xsl:apply-templates select="//w:tbl"/>
  </xsl:template>

  <!-- One table: emit its rows, then a blank line as separator -->
  <xsl:template match="w:tbl">
    <xsl:apply-templates select="w:tr"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- One row: emit its cells, then a newline -->
  <xsl:template match="w:tr">
    <xsl:apply-templates select="w:tc"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- One cell: emit its text runs, then a tab unless it is the last cell -->
  <xsl:template match="w:tc">
    <xsl:apply-templates select=".//w:t"/>
    <xsl:if test="position()!=last()">
      <xsl:text>&#9;</xsl:text>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
Or:
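<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="//w:tbl">
      <xsl:for-each select="w:tr">
        <xsl:for-each select="w:tc">
          <!-- In XSLT 1.0, value-of on a node-set outputs only the first node,
               so loop over every w:t to keep cells split across several runs intact -->
          <xsl:for-each select=".//w:t">
            <xsl:value-of select="."/>
          </xsl:for-each>
          <!-- Tab between cells, but not after the last one -->
          <xsl:if test="position()!=last()">
            <xsl:text>&#9;</xsl:text>
          </xsl:if>
        </xsl:for-each>
        <!-- Newline at the end of each row -->
        <xsl:text>&#10;</xsl:text>
      </xsl:for-each>
      <!-- Blank line between tables -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>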
Both will produce tab-delimited output. Note that this assumes the table cells do not themselves contain tabs.
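For the remaining wish-list items (semicolon-separated fields, one file per table), a possible sketch is shown below. It assumes your xsltproc build supports the EXSLT exsl:document extension element, which libxslt normally provides; plain XSLT 1.0 has no way to write more than one output. The file names table1.csv, table2.csv and so on are made up for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    extension-element-prefixes="exsl">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="//w:tbl">
      <!-- Assumption: exsl:document is available; position() numbers the tables 1, 2, ... -->
      <exsl:document href="table{position()}.csv" method="text">
        <xsl:apply-templates select="w:tr"/>
      </exsl:document>
    </xsl:for-each>
  </xsl:template>

  <!-- One row per line -->
  <xsl:template match="w:tr">
    <xsl:apply-templates select="w:tc"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Concatenate all text runs of a cell, then add a semicolon between cells -->
  <xsl:template match="w:tc">
    <xsl:for-each select=".//w:t">
      <xsl:value-of select="."/>
    </xsl:for-each>
    <xsl:if test="position()!=last()">
      <xsl:text>;</xsl:text>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Saved under a name of your choice (say split.xsl) and run as xsltproc split.xsl word/document.xml, this should write one file per table for the sample input. As with the tab-delimited versions, it assumes the cells contain no semicolons, and nested tables would need extra handling because //w:tbl and .//w:t also reach into them.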