Query problem when using an XML file in Pentaho

Date: 2015-10-13 19:37:21

Tags: xml pentaho

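Given the following XML file, how can I get the number of occurrences of a specific genre, for example count(genre)? I am writing these queries in Pentaho Report Designer. I have posted screenshots, which may help show how it works.

Here is the XPath I supplied.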

1 http://i60.tinypic.com/23jherk.png

This is the output in the application.

2 http://i59.tinypic.com/2uqnqj4.png

When I supply a query such as string-join(distinct-values(/catalog/book/genre), ','), I get an error:

2 http://i59.tinypic.com/2jbkso2.png

<?xml version="1.0"?>
    <catalog>
    <book id="bk101">
        <author>Gambardella, Matthew</author>
        <title>XML Developer's Guide</title>
        <genre>Computer</genre>
        <price>44.95</price>
        <publish_date>2000-10-01</publish_date>
        <description>An in-depth look at creating applications with XML.</description>
        <image>http://i.telegraph.co.uk/multimedia/archive/02445/mars_2445397b.jpg</image>
    </book>
    <book id="bk102">
        <author>Ralls, Kim</author>
        <title>Midnight Rain</title>
        <genre>Fantasy</genre>
        <price>5.95</price>
        <publish_date>2000-12-16</publish_date>
        <description>A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world.</description>
        <image>https://upload.wikimedia.org/wikipedia/commons /8/85/Venus_globe.jpg</image>
    </book>
    <book id="bk103">
        <author>Corets, Eva</author>
        <title>Maeve Ascendant</title>
        <genre>Fantasy</genre>
        <price>5.95</price>
        <publish_date>2000-11-17</publish_date>
        <description>After the collapse of a nanotechnology society in England, the young survivors lay the foundation for a new society.</description>
        <image>http://nssdc.gsfc.nasa.gov/image/planetary/saturn/saturn.jpg</image>
    </book>
    </catalog>

2 Answers:

Answer 0 (score: 0)

A generic XQuery example; I hope it gives you some guidance. I don't know what Pentaho Report Designer is.

    let $catalog := <catalog>
    <book id="bk101">
        <author>Gambardella, Matthew</author>
        <title>XML Developer's Guide</title>
        <genre>Computer</genre>
        <price>44.95</price>
        <publish_date>2000-10-01</publish_date>
        <description>An in-depth look at creating applications with XML.</description>
        <image>http://i.telegraph.co.uk/multimedia/archive/02445/mars_2445397b.jpg</image>
    </book>
    <book id="bk102">
        <author>Ralls, Kim</author>
        <title>Midnight Rain</title>
        <genre>Fantasy</genre>
        <price>5.95</price>
        <publish_date>2000-12-16</publish_date>
        <description>A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world.</description>
        <image>https://upload.wikimedia.org/wikipedia/commons /8/85/Venus_globe.jpg</image>
    </book>
    <book id="bk103">
        <author>Corets, Eva</author>
        <title>Maeve Ascendant</title>
        <genre>Fantasy</genre>
        <price>5.95</price>
        <publish_date>2000-11-17</publish_date>
        <description>After the collapse of a nanotechnology society in England, the young survivors lay the foundation for a new society.</description>
        <image>http://nssdc.gsfc.nasa.gov/image/planetary/saturn/saturn.jpg</image>
    </book>
</catalog>

return 
  element {"genre-count"}{
  for $genre in fn:distinct-values($catalog/book/genre)
     return element {"genre"}{ 
       attribute {"name"}{$genre},
       fn:count($catalog/book/genre[.=$genre])
     }
    }

Result:

<genre-count>
  <genre name="Computer">1</genre>
  <genre name="Fantasy">2</genre>
</genre-count>
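
For the single-genre case asked about in the question, a plain count() with a predicate may be enough. The expression below is a minimal sketch that uses only XPath 1.0 constructs, so it does not rely on distinct-values() or string-join(), which were introduced in XPath 2.0. It assumes the query is evaluated against the document root of the sample file; whether Pentaho Report Designer evaluates it in that context is something I have not verified. The genre name 'Fantasy' is just an example taken from the sample data.

```xquery
count(/catalog/book[genre = 'Fantasy'])
```

For the sample catalog this evaluates to 2. Dropping the predicate and counting /catalog/book/genre instead gives the total number of genre elements, which is 3 here.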

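Answer 1 (score: 0)

Like David Ennis, I am not familiar with Pentaho's XQuery support. From the few mentions of XQuery I was able to find in Pentaho's documentation, forums, and GitHub repositories, it appears that Pentaho uses Saxon, a very capable XQuery engine. Saxon should let you run general XPath and FLWOR expressions over XML data, but it is not clear to me what restrictions Pentaho imposes or what context it assumes. I suggest starting with a basic XPath expression, the building block of any XQuery: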

/catalog/book/title

This should return:

<title>XML Developer's Guide</title>
<title>Midnight Rain</title>
<title>Maeve Ascendant</title>

If that returns the expected result, try the following expression, which adds the standard library functions string-join() and distinct-values():

string-join(distinct-values(/catalog/book/genre), ', ')

This should return something like:

Computer, Fantasy

If that returns the expected result, try a FLWOR expression:

for $genre in distinct-values(/catalog/book/genre)
let $books-in-genre := /catalog/book[genre = $genre]
return
    <genre label="{$genre}" book-count="{count($books-in-genre)}"/>

If that fails, you may need to wrap the results in a single root node:

<genres>{
    for $genre in distinct-values(/catalog/book/genre)
    let $books-in-genre := /catalog/book[genre = $genre]
    return
        <genre label="{$genre}" book-count="{count($books-in-genre)}"/>
}</genres>
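
If the report expects plain values rather than XML elements (an assumption about how Pentaho consumes the result, which I have not verified), a variant that returns one string per genre might be easier to bind. This is only a sketch; the order by clause is added to make the output order predictable, since distinct-values() does not guarantee one.

```xquery
(: one string per genre in the form "name: count", sorted by genre name :)
for $genre in distinct-values(/catalog/book/genre)
order by $genre
return concat($genre, ': ', count(/catalog/book[genre = $genre]))
```

For the sample catalog this should yield the two strings Computer: 1 and Fantasy: 2.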

If you run into any problems, please post any error messages you receive; that may help.