Matching records based on criteria

Date: 2017-09-21 13:51:50

Tags: xml xslt-2.0

Given the following XML file as input:

<?xml version="1.0" encoding="UTF-8"?>
<TABLE NAME="TABLE.DB">
    <DATA RECORDS="2">
        <RECORD ID="1">
            <RECNO>1</RECNO>
            <SEQ>0</SEQ>
            <DATE>17/12/1999 2:44:08 μμ</DATE>
            <ID>12/11/2015 3:15:25 μμ</ID>
            <NUMBER>10355</NUMBER>
            <CN>PL</CN>
            <PROPERTY>0</PROPERTY>
            <DAYS>0</DAYS>
            <CURRENTSTATUS>0</CURRENTSTATUS>
            <TOTALS>1</TOTALS>
            <SUB_A>Some (random) data.</SUB_A>
        </RECORD>
        <RECORD ID="2">
            <RECNO>2</RECNO>
            <SEQUENCE>0</SEQUENCE>
            <DATE>17/12/1999 2:44:08 μμ</DATE>
            <ID>12/11/2015 3:15:25 μμ</ID>
            <NUMBER>10356</NUMBER>
            <CN>PL 300 L</CN>
            <PROPERTY>0</PROPERTY>
            <DAYS>10</DAYS>
            <CURRENTSTATUS>0</CURRENTSTATUS>
            <TOTALS>19837</TOTALS>
        </RECORD>
        <RECORD ID="3">
            <RECNO>3</RECNO>
            <SEQUENCE>0</SEQUENCE>
            <DATE>17/12/1999 2:44:08 μμ</DATE>
            <ID>12/11/2015 3:15:25 μμ</ID>
            <NUMBER>10357</NUMBER>
            <CN>PL 300 L</CN>
            <PROPERTY>0</PROPERTY>
            <DAYS>10</DAYS>
            <CURRENTSTATUS>0</CURRENTSTATUS>
            <SUB_A>Some [more] data.</SUB_A>
        </RECORD>
    </DATA>
</TABLE>

I want to insert data into the SUB_A element according to the following criteria. I have a tab-delimited file with these columns:

value A NUMBER  TOTALS

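Here value A must be entered into the SUB_A element, appended with | as a separator if SUB_A already exists in the record, or placed in a newly created SUB_A if it does not. NUMBER and TOTALS are the values needed to match the record on which the transformation should take place. The TOTALS value may be missing altogether from the input txt file.

So our transform.txt file looks like this:

Edit:

Using the solution Mr. Honnen kindly suggested, together with the suggested input data, when I input: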

value i'd like to add   10355   1
another) value i'd like to add  10357   19837
yet another value.] 10358
test    10354

I get the following error:

       Tree size: 113 nodes, 392 characters, 7 attributes
<line number="10355" totals="1">value i'd like to add</line><line number="10357" totals="19837">another) value i'd like to add</line><line number="10358" totals="">yet another value.]</line><line number="10354" totals="">test</line>
Error at char 6 in xsl:merge-source/@select on line 26 column 65 of merge.xsl:
  XTDE2220: Merge input for source line is not ordered according to merge key, detected at
  key value: ["10354", ""]
  in built-in template rule for /TABLE/DATA[1] in the unnamed mode
  at xsl:apply-templates (/merge.xsl#21)
     processing /TABLE
Merge input for source line is not ordered according to merge key, detected at key value: ["10354", ""]

Expected result:

<?xml version="1.0" encoding="UTF-8"?>
<TABLE NAME="TABLE.DB">
    <DATA RECORDS="2">
        <RECORD ID="1">
            <RECNO>1</RECNO>
            <SEQ>0</SEQ>
            <DATE>17/12/1999 2:44:08 μμ</DATE>
            <ID>12/11/2015 3:15:25 μμ</ID>
            <NUMBER>10355</NUMBER>
            <CN>PL</CN>
            <PROPERTY>0</PROPERTY>
            <DAYS>0</DAYS>
            <CURRENTSTATUS>0</CURRENTSTATUS>
            <TOTALS>1</TOTALS>
            <SUB_A>Some (random) data. | value i'd like to add</SUB_A>
        </RECORD>
        <RECORD ID="2">
            <RECNO>2</RECNO>
            <SEQUENCE>0</SEQUENCE>
            <DATE>17/12/1999 2:44:08 μμ</DATE>
            <ID>12/11/2015 3:15:25 μμ</ID>
            <NUMBER>10356</NUMBER>
            <CN>PL 300 L</CN>
            <PROPERTY>0</PROPERTY>
            <DAYS>10</DAYS>
            <CURRENTSTATUS>0</CURRENTSTATUS>
            <TOTALS>19837</TOTALS>
            <SUB_A>(another) value i'd like to add</SUB_A>
        </RECORD>
        <RECORD ID="3">
            <RECNO>3</RECNO>
            <SEQUENCE>0</SEQUENCE>
            <DATE>17/12/1999 2:44:08 μμ</DATE>
            <ID>12/11/2015 3:15:25 μμ</ID>
            <NUMBER>10357</NUMBER>
            <CN>PL 300 L</CN>
            <PROPERTY>0</PROPERTY>
            <DAYS>10</DAYS>
            <CURRENTSTATUS>0</CURRENTSTATUS>
            <SUB_A>Some [more] data. | [yet another value.]</SUB_A>
        </RECORD>
    </DATA>
</TABLE>

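What I have tried so far only makes the problem bigger and has not brought me any closer to what I expect...

1 answer:

Answer 0 (score: 0)

I tried to implement this using xsl:merge in XSLT 3.0. Dealing with a possibly missing totals value is not easy, because merge keys cannot be compared partially. It is also easy to break the merge when converting the text data to XML if there is any whitespace besides the tab characters: in the sample tab data, for instance, there were spaces after the last number, and that corrupted the merge key. This can be fixed by using normalize-space or, perhaps better, by using analyze-string instead of tokenize to extract the data.

What I came up with is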

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:math="http://www.w3.org/2005/xpath-functions/math" exclude-result-prefixes="xs math"
    expand-text="yes" version="3.0">

    <xsl:param name="text-uri" as="xs:string">test2017092101.txt</xsl:param>

    <xsl:mode on-no-match="shallow-copy"/>

    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:variable name="lines" as="element(line)*">
        <xsl:apply-templates select="unparsed-text-lines($text-uri)"/>
    </xsl:variable>

    <xsl:template match=".[. instance of xs:string]">
        <xsl:variable name="tokens" as="xs:string*" select="tokenize(., '&#9;')[normalize-space()]"/>
        <line number="{$tokens[2]}" totals="{$tokens[3]}">{$tokens[1]}</line>
    </xsl:template>

    <xsl:template match="TABLE/DATA">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:merge>
                <xsl:merge-source name="record" select="RECORD">
                    <xsl:merge-key select="NUMBER"/>
                    <xsl:merge-key select="string(TOTALS)"/>
                </xsl:merge-source>
                <xsl:merge-source name="line" select="$lines">
                    <xsl:merge-key select="@number"/>
                    <xsl:merge-key select="string(@totals)"/>
                </xsl:merge-source>
                <xsl:merge-action>
                    <xsl:if test="current-merge-group('record')">
                        <xsl:copy>
                            <xsl:apply-templates select="* except SUB_A"/>
                            <SUB_A>
                                <xsl:value-of select="SUB_A, current-merge-group()[2]" separator="|"
                                />
                            </SUB_A>
                        </xsl:copy>
                    </xsl:if>
                </xsl:merge-action>
            </xsl:merge>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

The sample I tried is

value i'd like to add   10355   1
(another) value i'd like to add 10357   19837
[yet another value.]    10358

with the input being

<?xml version="1.0" encoding="UTF-8"?>
<TABLE NAME="TABLE.DB">
    <DATA RECORDS="2">
        <RECORD ID="0">
            <RECNO>0</RECNO>
            <SEQ>0</SEQ>
            <DATE>17/12/1999 2:44:08 μμ</DATE>
            <ID>12/11/2015 3:15:25 μμ</ID>
            <NUMBER>10354</NUMBER>
            <CN>PL</CN>
            <PROPERTY>0</PROPERTY>
            <DAYS>0</DAYS>
            <CURRENTSTATUS>0</CURRENTSTATUS>
            <TOTALS>1</TOTALS>
            <SUB_A>Some random data not matched.</SUB_A>
        </RECORD>
        <RECORD ID="1">
            <RECNO>1</RECNO>
            <SEQ>0</SEQ>
            <DATE>17/12/1999 2:44:08 μμ</DATE>
            <ID>12/11/2015 3:15:25 μμ</ID>
            <NUMBER>10355</NUMBER>
            <CN>PL</CN>
            <PROPERTY>0</PROPERTY>
            <DAYS>0</DAYS>
            <CURRENTSTATUS>0</CURRENTSTATUS>
            <TOTALS>1</TOTALS>
            <SUB_A>Some (random) data.</SUB_A>
        </RECORD>
        <RECORD ID="2">
            <RECNO>2</RECNO>
            <SEQUENCE>0</SEQUENCE>
            <DATE>17/12/1999 2:44:08 μμ</DATE>
            <ID>12/11/2015 3:15:25 μμ</ID>
            <NUMBER>10356</NUMBER>
            <CN>PL 300 L</CN>
            <PROPERTY>0</PROPERTY>
            <DAYS>10</DAYS>
            <CURRENTSTATUS>0</CURRENTSTATUS>
            <SUB_A>Some random data not matched.</SUB_A>
        </RECORD>
        <RECORD ID="3">
            <RECNO>3</RECNO>
            <SEQUENCE>0</SEQUENCE>
            <DATE>17/12/1999 2:44:08 μμ</DATE>
            <ID>12/11/2015 3:15:25 μμ</ID>
            <NUMBER>10357</NUMBER>
            <CN>PL 300 L</CN>
            <PROPERTY>0</PROPERTY>
            <DAYS>10</DAYS>
            <CURRENTSTATUS>0</CURRENTSTATUS>
            <TOTALS>19837</TOTALS>
        </RECORD>
        <RECORD ID="3">
            <RECNO>3</RECNO>
            <SEQUENCE>0</SEQUENCE>
            <DATE>17/12/1999 2:44:08 μμ</DATE>
            <ID>12/11/2015 3:15:25 μμ</ID>
            <NUMBER>10358</NUMBER>
            <CN>PL 300 L</CN>
            <PROPERTY>0</PROPERTY>
            <DAYS>10</DAYS>
            <CURRENTSTATUS>0</CURRENTSTATUS>
            <SUB_A>Some [more] data.</SUB_A>
        </RECORD>
    </DATA>
</TABLE>

and then the result is

<?xml version="1.0" encoding="UTF-8"?>
<TABLE NAME="TABLE.DB">
   <DATA RECORDS="2">
      <RECORD>
         <RECNO>0</RECNO>
         <SEQ>0</SEQ>
         <DATE>17/12/1999 2:44:08 μμ</DATE>
         <ID>12/11/2015 3:15:25 μμ</ID>
         <NUMBER>10354</NUMBER>
         <CN>PL</CN>
         <PROPERTY>0</PROPERTY>
         <DAYS>0</DAYS>
         <CURRENTSTATUS>0</CURRENTSTATUS>
         <TOTALS>1</TOTALS>
         <SUB_A>Some random data not matched.</SUB_A>
      </RECORD>
      <RECORD>
         <RECNO>1</RECNO>
         <SEQ>0</SEQ>
         <DATE>17/12/1999 2:44:08 μμ</DATE>
         <ID>12/11/2015 3:15:25 μμ</ID>
         <NUMBER>10355</NUMBER>
         <CN>PL</CN>
         <PROPERTY>0</PROPERTY>
         <DAYS>0</DAYS>
         <CURRENTSTATUS>0</CURRENTSTATUS>
         <TOTALS>1</TOTALS>
         <SUB_A>Some (random) data.|value i'd like to add</SUB_A>
      </RECORD>
      <RECORD>
         <RECNO>2</RECNO>
         <SEQUENCE>0</SEQUENCE>
         <DATE>17/12/1999 2:44:08 μμ</DATE>
         <ID>12/11/2015 3:15:25 μμ</ID>
         <NUMBER>10356</NUMBER>
         <CN>PL 300 L</CN>
         <PROPERTY>0</PROPERTY>
         <DAYS>10</DAYS>
         <CURRENTSTATUS>0</CURRENTSTATUS>
         <SUB_A>Some random data not matched.</SUB_A>
      </RECORD>
      <RECORD>
         <RECNO>3</RECNO>
         <SEQUENCE>0</SEQUENCE>
         <DATE>17/12/1999 2:44:08 μμ</DATE>
         <ID>12/11/2015 3:15:25 μμ</ID>
         <NUMBER>10357</NUMBER>
         <CN>PL 300 L</CN>
         <PROPERTY>0</PROPERTY>
         <DAYS>10</DAYS>
         <CURRENTSTATUS>0</CURRENTSTATUS>
         <TOTALS>19837</TOTALS>
         <SUB_A>(another) value i'd like to add</SUB_A>
      </RECORD>
      <RECORD>
         <RECNO>3</RECNO>
         <SEQUENCE>0</SEQUENCE>
         <DATE>17/12/1999 2:44:08 μμ</DATE>
         <ID>12/11/2015 3:15:25 μμ</ID>
         <NUMBER>10358</NUMBER>
         <CN>PL 300 L</CN>
         <PROPERTY>0</PROPERTY>
         <DAYS>10</DAYS>
         <CURRENTSTATUS>0</CURRENTSTATUS>
         <SUB_A>Some [more] data.|[yet another value.]</SUB_A>
      </RECORD>
   </DATA>
</TABLE>

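Of course, it should also be possible to solve this using grouping instead of merging. The code that converts the tabular data to XML would remain the same.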
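As a rough illustration of the grouping idea, here is a minimal, untested sketch that swaps xsl:merge for xsl:for-each-group with a composite key. It keeps the parameter and the text-to-XML conversion from the stylesheet above. Copying the RECORD attributes, the " | " separator, and emitting SUB_A only when there is something to put in it are assumptions made to get closer to the expected result in the question; they are not part of the original answer. Unlike a merge, grouping does not require either input to be sorted by the key.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs" expand-text="yes" version="3.0">

    <!-- Same parameter and text-to-XML conversion as in the merge version. -->
    <xsl:param name="text-uri" as="xs:string">test2017092101.txt</xsl:param>

    <xsl:mode on-no-match="shallow-copy"/>
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:variable name="lines" as="element(line)*">
        <xsl:apply-templates select="unparsed-text-lines($text-uri)"/>
    </xsl:variable>

    <xsl:template match=".[. instance of xs:string]">
        <xsl:variable name="tokens" as="xs:string*" select="tokenize(., '&#9;')[normalize-space()]"/>
        <line number="{$tokens[2]}" totals="{$tokens[3]}">{$tokens[1]}</line>
    </xsl:template>

    <xsl:template match="TABLE/DATA">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <!-- Group records and lines together on (number, totals);
                 a missing TOTALS and an empty @totals both become ''. -->
            <xsl:for-each-group select="RECORD, $lines" composite="yes"
                group-by="string((NUMBER, @number)[1]), string((TOTALS, @totals)[1])">
                <xsl:variable name="additions" select="current-group()[self::line]"/>
                <!-- Groups that contain only text lines produce no output. -->
                <xsl:for-each select="current-group()[self::RECORD]">
                    <xsl:copy>
                        <xsl:copy-of select="@*"/>
                        <xsl:apply-templates select="* except SUB_A"/>
                        <!-- Assumption: create SUB_A only if there is content for it. -->
                        <xsl:if test="SUB_A or $additions">
                            <SUB_A>
                                <xsl:value-of select="SUB_A, $additions" separator=" | "/>
                            </SUB_A>
                        </xsl:if>
                    </xsl:copy>
                </xsl:for-each>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
```

Because group-by orders groups by first appearance and the RECORD elements come first in the selected sequence, records should be written in their original document order.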

This is getting a bit messy, but to avoid exchanging code in comments, try replacing the template that creates the line elements with
<xsl:template match=".[. instance of xs:string]">
    <xsl:variable name="groups" select="analyze-string(., '([^0-9]*?)\s*([0-9]+)(\s*([0-9]+))?')//*:group"/>
    <line number="{$groups[2]}" totals="{$groups[4]}">{$groups[1]}</line>
</xsl:template>

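This attempts to use analyze-string instead of tokenize, but it assumes that the first column contains no digits and that the second column and the optional third column consist only of digits.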
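The XTDE2220 error in the question's edit reports that the line merge source is not ordered by its merge key: 10354 arrives after 10358. Sorted input is a requirement of xsl:merge, so the error is separate from the whitespace problem. One possible remedy, offered here as an untested sketch and not taken from the original answer, is the sort-before-merge attribute of xsl:merge-source. The variant below also trims each tab-separated field with normalize-space and skips blank lines. The use of current-merge-group('line'), the copied RECORD attributes, and the " | " separator are further assumptions aimed at the expected result.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs" expand-text="yes" version="3.0">

    <xsl:param name="text-uri" as="xs:string">test2017092101.txt</xsl:param>

    <xsl:mode on-no-match="shallow-copy"/>
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- Skip blank lines so they do not produce empty merge keys. -->
    <xsl:variable name="lines" as="element(line)*">
        <xsl:apply-templates select="unparsed-text-lines($text-uri)[normalize-space()]"/>
    </xsl:variable>

    <!-- Trim every tab-separated field so stray spaces cannot alter a key. -->
    <xsl:template match=".[. instance of xs:string]">
        <xsl:variable name="tokens" as="xs:string*"
            select="tokenize(., '&#9;') ! normalize-space()"/>
        <line number="{$tokens[2]}" totals="{$tokens[3]}">{$tokens[1]}</line>
    </xsl:template>

    <xsl:template match="TABLE/DATA">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:merge>
                <!-- sort-before-merge sorts each source by its merge keys first. -->
                <xsl:merge-source name="record" select="RECORD" sort-before-merge="yes">
                    <xsl:merge-key select="NUMBER"/>
                    <xsl:merge-key select="string(TOTALS)"/>
                </xsl:merge-source>
                <xsl:merge-source name="line" select="$lines" sort-before-merge="yes">
                    <xsl:merge-key select="@number"/>
                    <xsl:merge-key select="string(@totals)"/>
                </xsl:merge-source>
                <xsl:merge-action>
                    <xsl:variable name="additions" select="current-merge-group('line')"/>
                    <!-- Text lines without a matching record produce no output. -->
                    <xsl:for-each select="current-merge-group('record')">
                        <xsl:copy>
                            <xsl:copy-of select="@*"/>
                            <xsl:apply-templates select="* except SUB_A"/>
                            <xsl:if test="SUB_A or $additions">
                                <SUB_A>
                                    <xsl:value-of select="SUB_A, $additions" separator=" | "/>
                                </SUB_A>
                            </xsl:if>
                        </xsl:copy>
                    </xsl:for-each>
                </xsl:merge-action>
            </xsl:merge>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
```

One trade-off to keep in mind: with sort-before-merge on the record source, the RECORD elements are written in merge-key order, which matches document order only if the input was already sorted by NUMBER and TOTALS. The keys are compared as strings by default, so numbers of different lengths would not sort numerically.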