XPath - matching optional child elements or empty text

Time: 2013-09-24 00:46:32

Tags: ruby xpath

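I have the following code, which should add an error if there is an orderEvents element and that element has child nodes that are neither orderEvent elements nor empty text. The test cases show my goal better.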

The XPath expression works in every case except when there is whitespace inside the orderEvents element. See the test case validVendorPaymentFormat7 below. Here is my XPath expression.

"./expectedVendorPaymentTransactions/orderEvents/node()[not(self::orderEvent) or self::text()[normalize-space(.)!='']]"

Test cases:

<TestData>
  <TestcaseList>
    <Testcase>
      <rootIdentifier>validVendorPaymentFormat1</rootIdentifier>
      <expectedVendorPaymentTransactions>
        <orderEvents></orderEvents>
      </expectedVendorPaymentTransactions>
    </Testcase>
    <Testcase>
      <rootIdentifier>validVendorPaymentFormat2</rootIdentifier>
    </Testcase>
    <Testcase>
      <rootIdentifier>validVendorPaymentFormat3</rootIdentifier>
      <expectedVendorPaymentTransactions>
      </expectedVendorPaymentTransactions>
    </Testcase>
    <Testcase>
      <rootIdentifier>validVendorPaymentFormat4</rootIdentifier>
      <expectedVendorPaymentTransactions>
        <orderEvents><orderEvent>SOME_EVENT</orderEvent></orderEvents>
      </expectedVendorPaymentTransactions>
    </Testcase>
    <Testcase>
      <rootIdentifier>validVendorPaymentFormat5</rootIdentifier>
      <expectedVendorPaymentTransactions>
        <orderEvents><orderEvent>SOME_EVENT</orderEvent></orderEvents>
      </expectedVendorPaymentTransactions>
    </Testcase>
    <Testcase>
      <rootIdentifier>validVendorPaymentFormat6</rootIdentifier>
      <expectedVendorPaymentTransactions>
        <orderEvents><orderEvent>SOME_EVENT</orderEvent><orderEvent>SOME_EVENT</orderEvent></orderEvents>
      </expectedVendorPaymentTransactions>
    </Testcase>
    <Testcase>
      <rootIdentifier>validVendorPaymentFormat7</rootIdentifier>
      <expectedVendorPaymentTransactions>
        <orderEvents>
          <orderEvent>SOME_EVENT</orderEvent>
        </orderEvents>
      </expectedVendorPaymentTransactions>
    </Testcase>
  </TestcaseList>
</TestData>

The failing case is validVendorPaymentFormat7.

Failure message:

[exec] Failure:
 [exec]   Order vendor payments format contains elements other than orderEvent for validVendorPaymentFormat7 ["\n          ", "\n        "].
 [exec]   <false> is not true.

Changing the failing case to:

        <Testcase>
          <rootIdentifier>validVendorPaymentFormat7</rootIdentifier>
          <expectedVendorPaymentTransactions>
            <orderEvents><orderEvent>SOME_EVENT</orderEvent></orderEvents>
          </expectedVendorPaymentTransactions>
        </Testcase>

makes the test pass.

Unfortunately, on http://www.freeformatter.com/xpath-tester.html test case #7 returns an empty list as expected, so the problem does not reproduce there.

Update - added code
Ruby version: 1.9
irb -v = 0.9.6

Test file:

require 'rexml/document'

xpath = "//expectedVendorPaymentTransactions/orderEvents/node()[not(self::orderEvent) or self::text()[translate(., ' &#10;', '')!='']]"
document = REXML::Document.new <<EOF 
    <Testcase>
      <rootIdentifier>validVendorPaymentFormat7</rootIdentifier>
      <expectedVendorPaymentTransactions>
        <orderEvents>
          <orderEvent>SOME_EVENT</orderEvent>
        </orderEvents>
      </expectedVendorPaymentTransactions>
    </Testcase>
EOF

document2 = REXML::Document.new <<EOF 
    <Testcase>
      <rootIdentifier>validVendorPaymentFormat7</rootIdentifier>
      <expectedVendorPaymentTransactions>
        <orderEvents><orderEvent>SOME_EVENT</orderEvent></orderEvents>
      </expectedVendorPaymentTransactions>
    </Testcase>
EOF

puts "1: #{REXML::XPath.match(document, xpath).inspect}"
puts "2: #{REXML::XPath.match(document2, xpath).inspect}"

Output:

irb(main):001:0> load './test/test_rexp.rb'
1: ["\n          ", "\n        "]
2: []
=> true
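
To see what the expression is matching, it can help to list the child nodes that REXML builds for the failing case. The sketch below is an illustration, not part of the original test file. `selected_by_predicate?` is a hypothetical helper that mirrors the predicate in plain Ruby, under a reading of XPath 1.0 semantics in which `not(self::orderEvent)` is true for any text node.

```ruby
require 'rexml/document'

# Same shape as test case validVendorPaymentFormat7.
SAMPLE_XML = <<~XML
  <Testcase>
    <expectedVendorPaymentTransactions>
      <orderEvents>
        <orderEvent>SOME_EVENT</orderEvent>
      </orderEvents>
    </expectedVendorPaymentTransactions>
  </Testcase>
XML

# Hypothetical helper: a plain-Ruby mirror of the predicate
#   not(self::orderEvent) or self::text()[normalize-space(.) != '']
def selected_by_predicate?(node)
  is_order_event = node.is_a?(REXML::Element) && node.name == 'orderEvent'
  non_blank_text = node.is_a?(REXML::Text) && !node.value.strip.empty?
  !is_order_event || non_blank_text
end

doc = REXML::Document.new(SAMPLE_XML)
order_events = REXML::XPath.first(doc, "//expectedVendorPaymentTransactions/orderEvents")

# One row per child node: class name, serialized text, and whether the
# mirrored predicate would select it.
CHILD_REPORT = order_events.children.map do |node|
  [node.class.name, node.to_s, selected_by_predicate?(node)]
end

CHILD_REPORT.each do |klass, text, selected|
  puts "#{klass} #{text.inspect} selected=#{selected}"
end
```

If that reading holds, the two whitespace-only text nodes are selected by the first clause alone, whatever the `normalize-space` or `translate` clause evaluates to, which would be consistent with the output above.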

With Jens's latest revision:

irb(main):009:0> load './test/test_rexp.rb'
1: ", "
2: ", "
=> true

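2 answers:

Answer 0 (score: 1)

It seems that Ruby does not treat newlines as whitespace in normalize-space(...). You only care whether the node contains anything other than whitespace, so simply delete all whitespace; translate(...) comes in handy here. Its first argument is the string to operate on, the second lists the characters to match, and the third lists the characters to replace them with; if the third is empty, all matched characters are removed.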

translate(., ' &#10;', '')

I was able to reproduce the problem with Perl's XPath and solved it with this query:

/expectedVendorPaymentTransactions/orderEvents/node()[not(self::orderEvent) or self::text()[translate(., ' &#10;', '')!='']]

Update: Ruby does not seem to parse the XML entity correctly, but you can use \n for the newline instead:

translate(., ' \n', '')
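
The behaviour of translate(...) described above can be sketched in plain Ruby. `xpath_translate` is a hypothetical helper written for illustration and is not a REXML API; it follows the XPath 1.0 rule that a matched character with no counterpart in the third argument is removed. It also suggests why `&#10;` did not help: inside a Ruby string literal nothing decodes the character reference, so the second argument consists of the literal characters `&`, `#`, `1`, `0` and `;` rather than a newline.

```ruby
# Hypothetical helper mirroring XPath 1.0 translate(string, from, to).
# Each character of `string` that occurs in `from` is replaced by the
# character at the same position in `to`; if `to` has no character at that
# position, the character is dropped. Other characters pass through.
def xpath_translate(string, from, to)
  string.each_char.map { |ch|
    index = from.index(ch)          # first occurrence wins, as in XPath
    index.nil? ? ch : to[index].to_s
  }.join
end

# The first whitespace-only text node from the failing test case.
WHITESPACE_NODE = "\n          "

# With a real newline in the second argument, nothing is left over.
puts xpath_translate(WHITESPACE_NODE, " \n", "").inspect

# With the undecoded character reference, the newline survives.
puts xpath_translate(WHITESPACE_NODE, " &#10;", "").inspect

# Plain Ruby equivalent of the deletion case.
puts WHITESPACE_NODE.delete(" \n").inspect
```

The first and third calls leave an empty string, while the second leaves the newline in place because only the space is actually matched.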

Answer 1 (score: 1)

Remove the whitespace handling from the XPath and use a fairly simple Ruby workaround.

Working code:

require 'rexml/document'
require 'rexml/text'

xpath = "//expectedVendorPaymentTransactions/orderEvents/node()[not(self::orderEvent)]"
document = REXML::Document.new <<EOF 
    <Testcase>
      <rootIdentifier>validVendorPaymentFormat7</rootIdentifier>
      <expectedVendorPaymentTransactions>
        <orderEvents>
          <orderEvent>Fulfill</orderEvent>
        </orderEvents>
      </expectedVendorPaymentTransactions>
    </Testcase>
EOF

document2 = REXML::Document.new <<EOF 
    <Testcase>
      <rootIdentifier>validVendorPaymentFormat7</rootIdentifier>
      <expectedVendorPaymentTransactions>
        <orderEvents><orderEvent>Fulfill</orderEvent></orderEvents>
      </expectedVendorPaymentTransactions>
    </Testcase>
EOF

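# Drop text nodes that contain only newlines, tabs or spaces; whatever
# remains is a node other than orderEvent or blank text.
elements = REXML::XPath.match(document, xpath)
elements.reject! { |element|
      element.instance_of?(REXML::Text) && element.value.gsub(/[\n\t ]/, "") == ''
    }
puts "1: #{elements.inspect}"

# Same filter for the single-line document, using the same character set.
elements2 = REXML::XPath.match(document2, xpath)
elements2.reject! { |element|
      element.instance_of?(REXML::Text) && element.value.gsub(/[\n\t ]/, "") == ''
    }
puts "2: #{elements2.inspect}"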
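
The filtering above is written out twice, so it may be easier to maintain as one helper. The following is a minimal sketch of that idea and not the answerer's code: `unexpected_order_event_nodes` and `parse_order_events` are hypothetical names, and this variant swaps the `not(self::orderEvent)` predicate for plain Ruby checks on the child nodes, so the XPath only has to locate the orderEvents elements.

```ruby
require 'rexml/document'

# Hypothetical helper: returns every child of an orderEvents element that is
# neither an <orderEvent> element nor a whitespace-only text node.
def unexpected_order_event_nodes(document)
  parents = REXML::XPath.match(document, "//expectedVendorPaymentTransactions/orderEvents")
  parents.flat_map(&:children).reject do |node|
    order_event = node.is_a?(REXML::Element) && node.name == 'orderEvent'
    blank_text  = node.is_a?(REXML::Text) && node.value.strip.empty?
    order_event || blank_text
  end
end

# Hypothetical helper for the examples: wraps `inner` in the element
# structure used by the test cases.
def parse_order_events(inner)
  REXML::Document.new(
    "<Testcase><expectedVendorPaymentTransactions><orderEvents>#{inner}" \
    "</orderEvents></expectedVendorPaymentTransactions></Testcase>"
  )
end

multi_line  = parse_order_events("\n  <orderEvent>Fulfill</orderEvent>\n")
single_line = parse_order_events("<orderEvent>Fulfill</orderEvent>")
invalid     = parse_order_events("stray<other/>")

puts "1: #{unexpected_order_event_nodes(multi_line).inspect}"
puts "2: #{unexpected_order_event_nodes(single_line).inspect}"
puts "3: #{unexpected_order_event_nodes(invalid).inspect}"
```

A test can then assert that the returned array is empty and print its contents in the failure message. Note that under this sketch comments inside orderEvents would also be reported as unexpected nodes.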