How can I use XQuery to get data from two XML files?

Asked: 2012-08-21 13:58:32

Tags: xquery basex

sample1.xml

<data><row><id>949459</id><product_id>4119945117</product_id></row>
    <row><id>781351</id><product_id>1009460692</product_id></row>
    <row><id>780163</id><product_id>1009755673</product_id></row>
    <row><id>1017226</id><product_id>1013393868</product_id></row>
    <row><id>1017956</id><product_id>1013393871</product_id></row>
    <row><id>1017310</id><product_id>1013393874</product_id></row>
    <row><id>771708</id><product_id>4388803569</product_id></row>
    <row><id>3270790</id><product_id>1013679270</product_id></row>
    <row><id>775869</id><product_id>1014142699</product_id></row>
    <row><id>1017599</id><product_id>1021870484</product_id></row>
    <row><id>1018789</id><product_id>1021870489</product_id></row>
    <row><id>1017091</id><product_id>1021870491</product_id></row>
    <row><id>1017333</id><product_id>1021870492</product_id></row>
    <row><id>1017630</id><product_id>1021870493</product_id></row>
    <row><id>1017774</id><product_id>1021870495</product_id></row>
    <row><id>1018192</id><product_id>1021870496</product_id></row>
    <row><id>1018725</id><product_id>4408034849</product_id></row>
    <row><id>1017990</id><product_id>1021870498</product_id></row>
    <row><id>1018027</id><product_id>1021870499</product_id></row>
    <row><id>1017166</id><product_id>1021870500</product_id></row>
    <row><id>769120</id><product_id>1032140806</product_id></row>
    <row><id>950336</id><product_id>1035310069</product_id></row>
    </data>

sample2.xml

<productSet>
 <row><product>4388803569</product></row>
 <row><product>4408034289</product></row>
 <row><product>4408034589</product></row>
 <row><product>4408034849</product></row>
 <row><product>4094557957</product></row>
 <row><product>4119945117</product></row>
</productSet>

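Here is my XQuery. It compares the product_id elements in sample1.xml with the product elements in sample2.xml and returns product values from sample1.xml. What I am trying to retrieve is every product in sample1.xml that is not available in sample2.xml.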

<outProduct_10310>
{ for $b in doc("sample1.xml")/data/row,
      $a in doc("sample2.xml")/productSet/row[product != $b/product_id]
  return   
      <op_id> { $b/product_id/text() } </op_id>
  }  </outProduct_10310>

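When I compare the two XML files with the equals sign (=), this code returns the correct matching data. However, I want the unmatched data, and when I run the code above with != it gives the following output: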

<outProduct_10310>
  <op_id>4119945117</op_id>
  <op_id>4119945117</op_id>
  <op_id>4119945117</op_id>
  <op_id>4119945117</op_id>
  <op_id>4119945117</op_id>
  <op_id>1009460692</op_id>
  <op_id>1009460692</op_id>
  <op_id>1009460692</op_id>
  <op_id>1009460692</op_id>
  <op_id>1009460692</op_id>
  <op_id>1009460692</op_id>
  <op_id>1009755673</op_id>
  <op_id>1009755673</op_id>
  <op_id>1009755673</op_id>
  <op_id>1009755673</op_id>
  <op_id>1009755673</op_id>
  <op_id>1009755673</op_id>
  <op_id>1013393868</op_id>
  <op_id>1013393868</op_id>
  <op_id>1013393868</op_id>
  <op_id>1013393868</op_id>
  <op_id>1013393868</op_id>
  <op_id>1013393868</op_id>
  <op_id>1013393871</op_id>
  <op_id>1013393871</op_id>
  <op_id>1013393871</op_id>
  <op_id>1013393871</op_id>
  <op_id>1013393871</op_id>
  <op_id>1013393871</op_id>
  <op_id>1013393874</op_id>
  <op_id>1013393874</op_id>
  <op_id>1013393874</op_id>
  <op_id>1013393874</op_id>
  <op_id>1013393874</op_id>
  <op_id>1013393874</op_id>
  <op_id>4388803569</op_id>
  <op_id>4388803569</op_id>
  <op_id>4388803569</op_id>
  <op_id>4388803569</op_id>
  <op_id>4388803569</op_id>
  <op_id>1013679270</op_id>
  <op_id>1013679270</op_id>
  <op_id>1013679270</op_id>
  <op_id>1013679270</op_id>
  <op_id>1013679270</op_id>
  <op_id>1013679270</op_id>
  <op_id>1014142699</op_id>
  <op_id>1014142699</op_id>
  <op_id>1014142699</op_id>
  <op_id>1014142699</op_id>
  <op_id>1014142699</op_id>
  <op_id>1014142699</op_id>
  <op_id>1021870484</op_id>
  <op_id>1021870484</op_id>
  <op_id>1021870484</op_id>
  <op_id>1021870484</op_id>
  <op_id>1021870484</op_id>
  <op_id>1021870484</op_id>
  <op_id>1021870489</op_id>
  <op_id>1021870489</op_id>
  <op_id>1021870489</op_id>
  <op_id>1021870489</op_id>
  <op_id>1021870489</op_id>
  <op_id>1021870489</op_id>
  <op_id>1021870491</op_id>
  <op_id>1021870491</op_id>
  <op_id>1021870491</op_id>
  <op_id>1021870491</op_id>
  <op_id>1021870491</op_id>
  <op_id>1021870491</op_id>
  <op_id>1021870492</op_id>
  <op_id>1021870492</op_id>
  <op_id>1021870492</op_id>
  <op_id>1021870492</op_id>
  <op_id>1021870492</op_id>
  <op_id>1021870492</op_id>
  <op_id>1021870493</op_id>
  <op_id>1021870493</op_id>
  <op_id>1021870493</op_id>
  <op_id>1021870493</op_id>
  <op_id>1021870493</op_id>
  <op_id>1021870493</op_id>
  <op_id>1021870495</op_id>
  <op_id>1021870495</op_id>
  <op_id>1021870495</op_id>
  <op_id>1021870495</op_id>
  <op_id>1021870495</op_id>
  <op_id>1021870495</op_id>
  <op_id>1021870496</op_id>
  <op_id>1021870496</op_id>
  <op_id>1021870496</op_id>
  <op_id>1021870496</op_id>
  <op_id>1021870496</op_id>
  <op_id>1021870496</op_id>
  <op_id>4408034849</op_id>
  <op_id>4408034849</op_id>
  <op_id>4408034849</op_id>
  <op_id>4408034849</op_id>
  <op_id>4408034849</op_id>
  <op_id>1021870498</op_id>
  <op_id>1021870498</op_id>
  <op_id>1021870498</op_id>
  <op_id>1021870498</op_id>
  <op_id>1021870498</op_id>
  <op_id>1021870498</op_id>
  <op_id>1021870499</op_id>
  <op_id>1021870499</op_id>
  <op_id>1021870499</op_id>
  <op_id>1021870499</op_id>
  <op_id>1021870499</op_id>
  <op_id>1021870499</op_id>
  <op_id>1021870500</op_id>
  <op_id>1021870500</op_id>
  <op_id>1021870500</op_id>
  <op_id>1021870500</op_id>
  <op_id>1021870500</op_id>
  <op_id>1021870500</op_id>
  <op_id>1032140806</op_id>
  <op_id>1032140806</op_id>
  <op_id>1032140806</op_id>
  <op_id>1032140806</op_id>
  <op_id>1032140806</op_id>
  <op_id>1032140806</op_id>
  <op_id>1035310069</op_id>
  <op_id>1035310069</op_id>
  <op_id>1035310069</op_id>
  <op_id>1035310069</op_id>
  <op_id>1035310069</op_id>
  <op_id>1035310069</op_id>
</outProduct_10310>

I am completely new to XQuery. Can anyone help me get the unmatched data?

2 Answers:

Answer 0: (score: 2)

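First of all, you do not need a for loop to compare every pair of ids: when given sequences, the standard = operator already searches for any pair of equal values.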

So a simpler way to find all matching products is:

<outProduct_10310>
{ for $b in doc("sample1.xml")/data/row[product_id = doc("sample2.xml")/productSet/row/product]
  return   
      <op_id> { $b/product_id/text() } </op_id>
  }  
</outProduct_10310>

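For the non-matching products, wrap the same comparison in the not() function to select every product that has no match. (Note that you cannot use != here, because sample2.xml will always contain at least one id that differs from the current one.)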

<outProduct_10310>
{ for $b in doc("sample1.xml")/data/row[not (product_id = doc("sample2.xml")/productSet/row/product) ]
  return   
      <op_id> { $b/product_id/text() } </op_id>
  }  
</outProduct_10310>

Depending on how well your XQuery engine optimizes, you may want to move doc("sample2.xml")/productSet/row out of the [..] predicate and store it in a variable.
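The variable suggestion above could be sketched as follows. This is only a minimal illustration: the variable name $available is an assumption made for this example, and whether it actually runs faster depends on the engine.

```xquery
(: Sketch: bind the lookup sequence once, then filter against it. :)
(: $available is a hypothetical name chosen for this example.      :)
let $available := doc("sample2.xml")/productSet/row/product
return
  <outProduct_10310>
  { for $b in doc("sample1.xml")/data/row[not(product_id = $available)]
    return
        <op_id> { $b/product_id/text() } </op_id>
  }
  </outProduct_10310>
```

The predicate is unchanged apart from the variable reference, so the result should be the same as in the query above.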

Answer 1: (score: 1)

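Note that all general comparisons (=, !=, ...) compare every item of the left operand with every item of the right operand. As soon as one of these comparisons succeeds (i.e. yields true), the whole expression is true as well. For example, (1,2) != (1,2) returns true, because the single comparison 1 != 2 yields true.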
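A few standalone expressions may make this existential behaviour easier to see. They use only literal integers, so they can be pasted into any XQuery processor; the values in the comments follow from the rule described above.

```xquery
(: General comparisons are existential: the result is true if ANY pair of items satisfies the operator. :)
(
  (1, 2) != (1, 2),       (: true:  the pair 1 != 2 succeeds              :)
  (1, 2) =  (1, 2),       (: true:  the pair 1 = 1 succeeds               :)
  (1, 1) != (1, 1),       (: false: every pair compares 1 with 1          :)
  not((1, 2) = (1, 2)),   (: false: at least one pair is equal            :)
  not((1, 2) = (3, 4))    (: true:  no pair is equal                      :)
)
```

The first two lines show that = and != can both be true for the same operands, which is why "no match" has to be written as not(... = ...) rather than with !=.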

In your particular query, you could try not(product = $b/product_id), which returns true only if none of the individual comparisons yields true.
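As a self-contained sketch of this idea, the query below inlines a two-row excerpt of each sample document instead of calling doc(), so it runs without any files. The variable names $data and $productSet are assumptions made for this example, and the comparison is applied to the whole sequence of product elements rather than to one row at a time.

```xquery
(: Two-row excerpts of the sample files, inlined so the query needs no external documents. :)
let $data :=
  <data>
    <row><id>949459</id><product_id>4119945117</product_id></row>
    <row><id>781351</id><product_id>1009460692</product_id></row>
  </data>
let $productSet :=
  <productSet>
    <row><product>4388803569</product></row>
    <row><product>4119945117</product></row>
  </productSet>
(: Keep a row only if none of the products equals its product_id. :)
for $b in $data/row
where not($productSet/row/product = $b/product_id)
return <op_id>{ $b/product_id/text() }</op_id>
```

With this excerpt only 1009460692 is returned, once, because 4119945117 appears in both fragments.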