将以下xml文件作为输入:
<?xml ="1.0" encoding="UTF-8"?>
<TABLE NAME="TABLE.DB">
<DATA RECORDS="2">
<RECORD ID="1">
<RECNO>1</RECNO>
<SEQ>0</SEQ>
<DATE>17/12/1999 2:44:08 μμ</DATE>
<ID>12/11/2015 3:15:25 μμ</ID>
<NUMBER>10355</NUMBER>
<CN>PL</CN>
<PROPERTY>0</PROPERTY>
<DAYS>0</DAYS>
<CURRENTSTATUS>0</CURRENTSTATUS>
<TOTALS>1</TOTALS>
<SUB_A>Some (random) data.</SUB_A>
</RECORD>
<RECORD ID="2">
<RECNO>2</RECNO>
<SEQUENCE>0</SEQUENCE>
<DATE>17/12/1999 2:44:08 μμ</DATE>
<ID>12/11/2015 3:15:25 μμ</ID>
<NUMBER>10356</NUMBER>
<CN>PL 300 L</CN>
<PROPERTY>0</PROPERTY>
<DAYS>10</DAYS>
<CURRENTSTATUS>0</CURRENTSTATUS>
<TOTALS>19837</TOTALS>
</RECORD>
<RECORD ID="3">
<RECNO>3</RECNO>
<SEQUENCE>0</SEQUENCE>
<DATE>17/12/1999 2:44:08 μμ</DATE>
<ID>12/11/2015 3:15:25 μμ</ID>
<NUMBER>10357</NUMBER>
<CN>PL 300 L</CN>
<PROPERTY>0</PROPERTY>
<DAYS>10</DAYS>
<CURRENTSTATUS>0</CURRENTSTATUS>
<SUB_A>Some [more] data.</SUB_A>
</RECORD>
</DATA>
</TABLE>
我想根据以下标准将数据插入SUB_A元素: 我有一个标签分隔文件,
value A NUMBER TOTALS
其中值A必须在SUB_A元素中输入,并带有|作为分隔符,如果SUB_A已存在于记录中,或者要创建,如果它不存在。 NUMBER和TOTALS是我们需要的,以便在转换将发生的记录上进行匹配。 TOTALS值可能根本不存在于我们的输入txt文件中:
所以我们的transform.txt文件如下所示:
编辑:
先生亲切地使用解决方案先生。 Honnen建议,建议输入数据,当我输入时:value i'd like to add 10355 1
another) value i'd like to add 10357 19837
yet another value.] 10358
test 10354
我收到以下错误:
Tree size: 113 nodes, 392 characters, 7 attributes
<line number="10355" totals="1">value i'd like to add</line><line number="10357" totals="19837">another) value i'd like to add</line><line number="10358" totals="">yet another value.]</line><line number="10354" totals="">test</line>
Error at char 6 in xsl:merge-source/@select on line 26 column 65 of merge.xsl:
XTDE2220: Merge input for source line is not ordered according to merge key, detected at
key value: ["10354", ""]
in built-in template rule for /TABLE/DATA[1] in the unnamed mode
at xsl:apply-templates (/merge.xsl#21)
processing /TABLE
Merge input for source line is not ordered according to merge key, detected at key value: ["10354", ""]
预期结果:
<?xml ="1.0" encoding="UTF-8"?>
<TABLE NAME="TABLE.DB">
<DATA RECORDS="2">
<RECORD ID="1">
<RECNO>1</RECNO>
<SEQ>0</SEQ>
<DATE>17/12/1999 2:44:08 μμ</DATE>
<ID>12/11/2015 3:15:25 μμ</ID>
<NUMBER>10355</NUMBER>
<CN>PL</CN>
<PROPERTY>0</PROPERTY>
<DAYS>0</DAYS>
<CURRENTSTATUS>0</CURRENTSTATUS>
<TOTALS>1</TOTALS>
<SUB_A>Some (random) data. | value i'd like to add</SUB_A>
</RECORD>
<RECORD ID="2">
<RECNO>2</RECNO>
<SEQUENCE>0</SEQUENCE>
<DATE>17/12/1999 2:44:08 μμ</DATE>
<ID>12/11/2015 3:15:25 μμ</ID>
<NUMBER>10356</NUMBER>
<CN>PL 300 L</CN>
<PROPERTY>0</PROPERTY>
<DAYS>10</DAYS>
<CURRENTSTATUS>0</CURRENTSTATUS>
<TOTALS>19837</TOTALS>
<SUB_A>(another) value i'd like to add</SUB_A>
</RECORD>
<RECORD ID="3">
<RECNO>3</RECNO>
<SEQUENCE>0</SEQUENCE>
<DATE>17/12/1999 2:44:08 μμ</DATE>
<ID>12/11/2015 3:15:25 μμ</ID>
<NUMBER>10357</NUMBER>
<CN>PL 300 L</CN>
<PROPERTY>0</PROPERTY>
<DAYS>10</DAYS>
<CURRENTSTATUS>0</CURRENTSTATUS>
<SUB_A>Some [more] data. | [yet another value.]</SUB_A>
</RECORD>
</DATA>
</TABLE>
我所尝试过的,会让问题变得更大,而且还没有让我接近我所期待的......
答案 0 :(得分:0)
所以我尝试使用xsl:merge中的XSLT 3.0来实现它,要处理丢失总计值的可能性并不容易,因为你不能使用合并键部分比较。如果在制表符字符之外还有任何空格,例如在示例选项卡数据中,在最后一个数字后面有空格并且弄乱了合并键,那么将文本数据转换为XML数据也很容易弄乱合并。虽然可以通过使用analyze-string
代替tokenize
来提取数据,但可以通过规范化空间使用来修复,或者更好。
我想出的是
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:math="http://www.w3.org/2005/xpath-functions/math" exclude-result-prefixes="xs math"
expand-text="yes" version="3.0">
<xsl:param name="text-uri" as="xs:string">test2017092101.txt</xsl:param>
<xsl:mode on-no-match="shallow-copy"/>
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:variable name="lines" as="element(line)*">
<xsl:apply-templates select="unparsed-text-lines($text-uri)"/>
</xsl:variable>
<xsl:template match=".[. instance of xs:string]">
<xsl:variable name="tokens" as="xs:string*" select="tokenize(., '	')[normalize-space()]"/>
<line number="{$tokens[2]}" totals="{$tokens[3]}">{$tokens[1]}</line>
</xsl:template>
<xsl:template match="TABLE/DATA">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:merge>
<xsl:merge-source name="record" select="RECORD">
<xsl:merge-key select="NUMBER"/>
<xsl:merge-key select="string(TOTALS)"/>
</xsl:merge-source>
<xsl:merge-source name="line" select="$lines">
<xsl:merge-key select="@number"/>
<xsl:merge-key select="string(@totals)"/>
</xsl:merge-source>
<xsl:merge-action>
<xsl:if test="current-merge-group('record')">
<xsl:copy>
<xsl:apply-templates select="* except SUB_A"/>
<SUB_A>
<xsl:value-of select="SUB_A, current-merge-group()[2]" separator="|"
/>
</SUB_A>
</xsl:copy>
</xsl:if>
</xsl:merge-action>
</xsl:merge>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
我试过的样本是
value i'd like to add 10355 1
(another) value i'd like to add 10357 19837
[yet another value.] 10358
输入为
<?xml version="1.0" encoding="UTF-8"?>
<TABLE NAME="TABLE.DB">
<DATA RECORDS="2">
<RECORD ID="0">
<RECNO>0</RECNO>
<SEQ>0</SEQ>
<DATE>17/12/1999 2:44:08 μμ</DATE>
<ID>12/11/2015 3:15:25 μμ</ID>
<NUMBER>10354</NUMBER>
<CN>PL</CN>
<PROPERTY>0</PROPERTY>
<DAYS>0</DAYS>
<CURRENTSTATUS>0</CURRENTSTATUS>
<TOTALS>1</TOTALS>
<SUB_A>Some random data not matched.</SUB_A>
</RECORD>
<RECORD ID="1">
<RECNO>1</RECNO>
<SEQ>0</SEQ>
<DATE>17/12/1999 2:44:08 μμ</DATE>
<ID>12/11/2015 3:15:25 μμ</ID>
<NUMBER>10355</NUMBER>
<CN>PL</CN>
<PROPERTY>0</PROPERTY>
<DAYS>0</DAYS>
<CURRENTSTATUS>0</CURRENTSTATUS>
<TOTALS>1</TOTALS>
<SUB_A>Some (random) data.</SUB_A>
</RECORD>
<RECORD ID="2">
<RECNO>2</RECNO>
<SEQUENCE>0</SEQUENCE>
<DATE>17/12/1999 2:44:08 μμ</DATE>
<ID>12/11/2015 3:15:25 μμ</ID>
<NUMBER>10356</NUMBER>
<CN>PL 300 L</CN>
<PROPERTY>0</PROPERTY>
<DAYS>10</DAYS>
<CURRENTSTATUS>0</CURRENTSTATUS>
<SUB_A>Some random data not matched.</SUB_A>
</RECORD>
<RECORD ID="3">
<RECNO>3</RECNO>
<SEQUENCE>0</SEQUENCE>
<DATE>17/12/1999 2:44:08 μμ</DATE>
<ID>12/11/2015 3:15:25 μμ</ID>
<NUMBER>10357</NUMBER>
<CN>PL 300 L</CN>
<PROPERTY>0</PROPERTY>
<DAYS>10</DAYS>
<CURRENTSTATUS>0</CURRENTSTATUS>
<TOTALS>19837</TOTALS>
</RECORD>
<RECORD ID="3">
<RECNO>3</RECNO>
<SEQUENCE>0</SEQUENCE>
<DATE>17/12/1999 2:44:08 μμ</DATE>
<ID>12/11/2015 3:15:25 μμ</ID>
<NUMBER>10358</NUMBER>
<CN>PL 300 L</CN>
<PROPERTY>0</PROPERTY>
<DAYS>10</DAYS>
<CURRENTSTATUS>0</CURRENTSTATUS>
<SUB_A>Some [more] data.</SUB_A>
</RECORD>
</DATA>
</TABLE>
然后结果
<?xml version="1.0" encoding="UTF-8"?>
<TABLE NAME="TABLE.DB">
<DATA RECORDS="2">
<RECORD>
<RECNO>0</RECNO>
<SEQ>0</SEQ>
<DATE>17/12/1999 2:44:08 μμ</DATE>
<ID>12/11/2015 3:15:25 μμ</ID>
<NUMBER>10354</NUMBER>
<CN>PL</CN>
<PROPERTY>0</PROPERTY>
<DAYS>0</DAYS>
<CURRENTSTATUS>0</CURRENTSTATUS>
<TOTALS>1</TOTALS>
<SUB_A>Some random data not matched.</SUB_A>
</RECORD>
<RECORD>
<RECNO>1</RECNO>
<SEQ>0</SEQ>
<DATE>17/12/1999 2:44:08 μμ</DATE>
<ID>12/11/2015 3:15:25 μμ</ID>
<NUMBER>10355</NUMBER>
<CN>PL</CN>
<PROPERTY>0</PROPERTY>
<DAYS>0</DAYS>
<CURRENTSTATUS>0</CURRENTSTATUS>
<TOTALS>1</TOTALS>
<SUB_A>Some (random) data.|value i'd like to add</SUB_A>
</RECORD>
<RECORD>
<RECNO>2</RECNO>
<SEQUENCE>0</SEQUENCE>
<DATE>17/12/1999 2:44:08 μμ</DATE>
<ID>12/11/2015 3:15:25 μμ</ID>
<NUMBER>10356</NUMBER>
<CN>PL 300 L</CN>
<PROPERTY>0</PROPERTY>
<DAYS>10</DAYS>
<CURRENTSTATUS>0</CURRENTSTATUS>
<SUB_A>Some random data not matched.</SUB_A>
</RECORD>
<RECORD>
<RECNO>3</RECNO>
<SEQUENCE>0</SEQUENCE>
<DATE>17/12/1999 2:44:08 μμ</DATE>
<ID>12/11/2015 3:15:25 μμ</ID>
<NUMBER>10357</NUMBER>
<CN>PL 300 L</CN>
<PROPERTY>0</PROPERTY>
<DAYS>10</DAYS>
<CURRENTSTATUS>0</CURRENTSTATUS>
<TOTALS>19837</TOTALS>
<SUB_A>(another) value i'd like to add</SUB_A>
</RECORD>
<RECORD>
<RECNO>3</RECNO>
<SEQUENCE>0</SEQUENCE>
<DATE>17/12/1999 2:44:08 μμ</DATE>
<ID>12/11/2015 3:15:25 μμ</ID>
<NUMBER>10358</NUMBER>
<CN>PL 300 L</CN>
<PROPERTY>0</PROPERTY>
<DAYS>10</DAYS>
<CURRENTSTATUS>0</CURRENTSTATUS>
<SUB_A>Some [more] data.|[yet another value.]</SUB_A>
</RECORD>
</DATA>
</TABLE>
当然也应该可以使用分组而不是合并来解决这个问题。将表格数据转换为XML的代码将保持不变。
这有点乱,但是为了不在注释中交换代码,请尝试替换模板以使用
创建line
元素
<xsl:template match=".[. instance of xs:string]">
<xsl:variable name="groups" select="analyze-string(., '([^0-9]*?)\s*([0-9]+)(\s*([0-9]+))?')//*:group"/>
<line number="{$groups[2]}" totals="{$groups[4]}">{$groups[1]}</line>
</xsl:template>
尝试使用analyze-string
代替tokenize
,但假设第一列不包含任何数字,第二列和可选第三列由数字组成。