Question

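I am filtering a large file that contains shoes for kids, men, and women.

Now I want to filter out certain types of women's shoes. The following XPath works, but the program I am using has a limit on XPath length, so I was wondering whether there is a shorter or more efficient way to construct this XPath: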

/Products/Product[contains(CategoryPath/ProductCategoryPath,'Halbschuhe') and contains(CategoryPath/ProductCategoryPath,'Damen') or  contains(CategoryPath/ProductCategoryPath,'Sneaker') and contains(CategoryPath/ProductCategoryPath,'Damen') or contains(CategoryPath/ProductCategoryPath,'Ballerinas') and contains(CategoryPath/ProductCategoryPath,'Damen')]

Edit: added the requested file sample

<Products>
    <!-- snip -->
    <Product ProgramID="4875" ArticleNumber="GO1-f05-0001-12">
        <CategoryPath>
            <ProductCategoryID>34857489</ProductCategoryID>
            <ProductCategoryPath>Damen &gt; Sale &gt; Schuhe &gt; Sneaker &gt; Sneaker Low</ProductCategoryPath>
            <AffilinetProductCategoryPath>Kleidung &amp; Accessoires?</AffilinetProductCategoryPath>
        </CategoryPath>
        <Price>
            <DisplayPrice>40.95 EUR</DisplayPrice>
            <Price>40.95</Price>
        </Price>
    </Product>
    <!-- snip -->
</Products>

Answer 1

If you have XPath 2.0 available, you should try the matches() function or even tokenize(), as Ranon suggested.

With XPath 1.0, one way to shorten the expression could be:

/Products/Product[
    CategoryPath/ProductCategoryPath[
        contains(., 'Damen')
            and (  contains(., 'Halbschuhe')
                or contains(.,    'Sneaker')
                or contains(., 'Ballerinas') )] ]

A handy one-liner for easy copy-pasting:

/Products/Product[CategoryPath/ProductCategoryPath[contains(.,'Damen') and (contains(.,'Halbschuhe') or contains(.,'Sneaker') or contains(.,'Ballerinas'))]]

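I tried to keep your expression as it is; none of the changes should alter its behavior in any way (assuming each Product has a single ProductCategoryPath, as in your sample).

There are some even shorter solutions, but they have to make assumptions about the XML structure and so on. Without the full context we cannot tell whether those assumptions hold, so we won't go that way.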
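The shortening relies only on factoring the common `contains(., 'Damen')` test out of the three `and` clauses. As a quick sanity check, here is a minimal Python sketch that compares both boolean forms over every combination of the four tests. The function names `original` and `shortened` are made up for this illustration, and the booleans stand in for the `contains()` results; it does not run any XPath.

```python
from itertools import product

def original(damen, halbschuhe, sneaker, ballerinas):
    # Mirrors the question's predicate: `and` binds tighter than `or`,
    # in XPath as well as in Python.
    return (halbschuhe and damen) or (sneaker and damen) or (ballerinas and damen)

def shortened(damen, halbschuhe, sneaker, ballerinas):
    # Mirrors the shortened predicate with 'Damen' factored out.
    return damen and (halbschuhe or sneaker or ballerinas)

# Collect every input combination on which the two forms disagree.
mismatches = [
    flags
    for flags in product([False, True], repeat=4)
    if original(*flags) != shortened(*flags)
]
print(mismatches)
```

An empty list means the two predicates agree on all 16 combinations.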

Answer 2

If your XPath engine supports XPath 2.0, it can be done in a more convenient (and probably more efficient) way:

//Product[
  CategoryPath/ProductCategoryPath[
    tokenize(., '\s') = ('Halbschuhe', 'Sneaker', 'Ballerinas') and contains(., 'Damen')
  ]
]

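fn:tokenize($string, $token) splits a string on a regular expression (whitespace here; you could also supply just a single space). The = operator compares with set-based semantics, so it returns true if any string on the left-hand side equals any string on the right-hand side.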
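To make the tokenize-and-compare idea concrete, here is a rough Python analogue, offered only as a sketch. The helper name `matches_category` is hypothetical, `re.split` stands in for fn:tokenize, and `any(...)` stands in for the set-based `=` comparison; the sample string is the decoded ProductCategoryPath from the question.

```python
import re

def matches_category(path, wanted=("Halbschuhe", "Sneaker", "Ballerinas")):
    # Split on single whitespace characters, like tokenize(., '\s').
    tokens = re.split(r"\s", path)
    # True if any token equals any wanted value, and 'Damen' occurs in the path.
    return any(tok in wanted for tok in tokens) and "Damen" in path

sample = "Damen > Sale > Schuhe > Sneaker > Sneaker Low"
print(matches_category(sample))
```

Note that this compares whole tokens, so a path containing only "Sneakers" would not match "Sneaker", whereas a contains() test would match it.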

XPath and/or syntax: any shorter way to write this XPath?
