Removing duplicates - need to match business name and multiple addresses

Date: 2015-09-28 18:35:39

Tags: xml xslt xslt-1.0 xslt-2.0 xslt-grouping

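I need to identify and remove duplicates based on 'businessName' and a full address match. Given the XML below, I want the clients with id 1 and 3 to match, because businessName matches and at least one address matches (address1, city, state, postalCode; address2 is not included). Note that for the address match, 'postalCode' only needs to match on its first 5 digits, not on the ZIP+4 extension.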

XSLT 2.0 is OK (Saxon Enterprise Edition).

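I assume I would use for-each-group, but I'm confused about how to handle the address matching when each client has multiple addresses. I've been playing with following-sibling but haven't gotten anywhere. Any solutions or pointers are appreciated. Thanks.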

<xsl:for-each-group select="Clients/client" group-by="businessName">
</xsl:for-each-group>


<Clients>
    <client>
        <id>1</id>
        <businessName>ABC Tile</businessName>
        <addresses>
            <address>
                <address1>PO Box 1057</address1>
                <address2/>
                <city>Denver</city>
                <state>CO</state>
                <postalCode>801230000</postalCode>
            </address>
            <address>
                <address1>PO Box 621188</address1>
                <address2/>
                <city>Denver</city>
                <state>CO</state>
                <postalCode>801230000</postalCode>
            </address>
        </addresses>
    </client>
    <client>
        <id>2</id>
        <businessName>123 Tile</businessName>
        <addresses>
            <address>
                <address1>567 Main Street</address1>
                <address2/>
                <city>Denver</city>
                <state>CO</state>
                <postalCode>801230000</postalCode>
            </address>
        </addresses>
    </client>
    <client>
        <id>3</id>
        <businessName>ABC Tile</businessName>
        <addresses>
            <address>
                <address1>123 Main Street</address1>
                <address2/>
                <city>Denver</city>
                <state>CO</state>
                <postalCode>801230000</postalCode>
            </address>
            <address>
                <address1>PO Box 1057</address1>
                <address2/>
                <city>Denver</city>
                <state>CO</state>
                <postalCode>801235555</postalCode>
            </address>
        </addresses>
    </client>
</Clients>

Here is the desired result, with client id 1 listing the ids of all matching clients.

<Clients>
    <client>
        <id>1</id>
        <clientMatch>3</clientMatch>
        <businessName>ABC Tile</businessName>
        <addresses>
            <address>
                <address1>PO Box 1057</address1>
                <address2/>
                <city>Denver</city>
                <state>CO</state>
                <postalCode>801230000</postalCode>
            </address>
            <address>
                <address1>PO Box 621188</address1>
                <address2/>
                <city>Denver</city>
                <state>CO</state>
                <postalCode>801230000</postalCode>
            </address>
        </addresses>
    </client>
    <client>
        <id>2</id>
        <businessName>123 Tile</businessName>
        <addresses>
            <address>
                <address1>567 Main Street</address1>
                <address2/>
                <city>Denver</city>
                <state>CO</state>
                <postalCode>801230000</postalCode>
            </address>
        </addresses>
    </client>
</Clients>

1 Answer:

Answer 0 (score: 1)

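I think you can use for-each-group on businessName, but going further with that construct is difficult, because you need to check whether at least one of several address elements matches. So I came up with http://xsltransform.net/gWvjQeP/1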

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:mf="http://example.com/mf"
  exclude-result-prefixes="xs mf">

<xsl:output indent="yes"/>

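<!-- Builds the comparison key for one address: address1, city, state and the
     first 5 digits of postalCode (address2 is ignored) -->
<xsl:function name="mf:key" as="xs:string">
    <xsl:param name="address" as="element(address)"/>
    <xsl:sequence select="concat($address/address1, '|', $address/city, '|', $address/state, '|', substring($address/postalCode, 1, 5))"/>
</xsl:function>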

<xsl:template match="Clients">
    <xsl:copy>
        <!-- Only clients that share a businessName can be duplicates of each other -->
        <xsl:for-each-group select="client" group-by="businessName">
          <xsl:for-each select="current-group()">
              <xsl:variable name="pos" as="xs:integer" select="position()"/>
              <!-- Skip this client if an earlier client in the group shares at least one address key -->
              <xsl:if test="not(current-group()[position() lt $pos][addresses/address/mf:key(.) = current()/addresses/address/mf:key(.)])">
                  <xsl:copy>
                      <xsl:copy-of select="id"/>
                      <!-- ids of later clients in the group that share at least one address key -->
                      <clientMatch>
                          <xsl:value-of select="current-group()[position() gt $pos][addresses/address/mf:key(.) = current()/addresses/address/mf:key(.)]/id" separator=", "/>
                      </clientMatch>
                      <xsl:copy-of select="* except id"/>
                  </xsl:copy>
              </xsl:if>
          </xsl:for-each>
        </xsl:for-each-group>
    </xsl:copy>
</xsl:template>
</xsl:transform>
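
The "at least one address matches" test in the stylesheet above relies on the XPath 2.0 general comparison `=`, which is true as soon as any item of the left-hand sequence equals any item of the right-hand sequence. Here is a minimal sketch of that behaviour in isolation. The variable names and the named template `main` are made up for illustration, and the key strings are written by hand in the same shape that mf:key produces for the sample data:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Hypothetical named entry point, so no source document is needed -->
  <xsl:template name="main">
    <!-- Hand-written keys in the 'address1|city|state|zip5' shape (illustration only) -->
    <xsl:variable name="keys1" as="xs:string*"
      select="('PO Box 1057|Denver|CO|80123', 'PO Box 621188|Denver|CO|80123')"/>
    <xsl:variable name="keys2" as="xs:string*"
      select="('567 Main Street|Denver|CO|80123')"/>
    <xsl:variable name="keys3" as="xs:string*"
      select="('123 Main Street|Denver|CO|80123', 'PO Box 1057|Denver|CO|80123')"/>

    <!-- true: 'PO Box 1057|Denver|CO|80123' occurs in both sequences -->
    <xsl:value-of select="$keys1 = $keys3"/>
    <xsl:text>&#10;</xsl:text>
    <!-- false: the two sequences have no item in common -->
    <xsl:value-of select="$keys1 = $keys2"/>
  </xsl:template>
</xsl:transform>
```

With Saxon this can be run without an input file by naming the initial template (for example `-it:main` on the command line). Because `=` is existential, no explicit loop over address pairs is needed.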

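I'm not sure whether you want to output the address elements of all matching clients or only those of the first one. Your question shows only those of the first client, so the current sample does that.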
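
One difference from the desired result in the question: the sample above writes all matching ids into a single clientMatch element and also emits an empty clientMatch for clients that have no match (such as id 2), whereas the question's output has no clientMatch element for client 2. If that matters, a variant along the following lines might be closer. It is only a sketch: it assumes one clientMatch element per matching id is acceptable, and the `$keys` variable is introduced here purely for readability.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:mf="http://example.com/mf"
  exclude-result-prefixes="xs mf">

<xsl:output indent="yes"/>

<!-- Same key as before: address1, city, state, first 5 digits of postalCode -->
<xsl:function name="mf:key" as="xs:string">
    <xsl:param name="address" as="element(address)"/>
    <xsl:sequence select="concat($address/address1, '|', $address/city, '|', $address/state, '|', substring($address/postalCode, 1, 5))"/>
</xsl:function>

<xsl:template match="Clients">
    <xsl:copy>
        <xsl:for-each-group select="client" group-by="businessName">
            <xsl:for-each select="current-group()">
                <xsl:variable name="pos" as="xs:integer" select="position()"/>
                <!-- All address keys of the client being processed (assumed helper variable) -->
                <xsl:variable name="keys" as="xs:string*" select="addresses/address/mf:key(.)"/>
                <!-- Keep the client only if no earlier client in the group shares a key -->
                <xsl:if test="not(current-group()[position() lt $pos][addresses/address/mf:key(.) = $keys])">
                    <xsl:copy>
                        <xsl:copy-of select="id"/>
                        <!-- One clientMatch per later matching client; nothing is written when there is none -->
                        <xsl:for-each select="current-group()[position() gt $pos][addresses/address/mf:key(.) = $keys]">
                            <clientMatch>
                                <xsl:value-of select="id"/>
                            </clientMatch>
                        </xsl:for-each>
                        <xsl:copy-of select="* except id"/>
                    </xsl:copy>
                </xsl:if>
            </xsl:for-each>
        </xsl:for-each-group>
    </xsl:copy>
</xsl:template>
</xsl:transform>
```

Like the original sample, this only compares each client directly against the others in its businessName group. It does not chain matches, so a client that matches a dropped duplicate but not the kept client is still output separately.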