From relational tables to a nested XML output file

Time: 2010-02-23 10:04:41

Tags: xml db2 xquery

Using an IBM DB2 database, I have three relational tables:

    Project: id, title, description
    Topic: projectId, value
    Tag: projectId, value

I need to produce the following XML file from these tables:

    <projects>
      <project id="project1">
        <title>title1</title>
        <description>desc1</description>
        <topic>topic1</topic>
        <topic>topic2</topic>
        <tag>tag1</tag>
        <tag>tag2</tag>
        <tag>tag3</tag>  
      </project>
      ...
    </projects>


I've tried the following query, and it works:

    XQUERY 
    let $projects := db2-fn:sqlquery('SELECT XMLELEMENT(NAME "project", XMLATTRIBUTES(id, title, description)) AS project FROM mydb.Project')  
    let $TopicSet := db2-fn:sqlquery('SELECT XMLELEMENT(NAME "row", XMLATTRIBUTES(projectId, value)) FROM mydb.Topic')
    let $TagSet := db2-fn:sqlquery('SELECT XMLELEMENT(NAME "row", XMLATTRIBUTES(projectId, value)) FROM mydb.Tag')

    for $project in $projects return
    <project>
    {$project/@ID}
    <title>{$project/fn:string(@TITLE)}</title>
    <description>{$project/fn:string(@DESCRIPTION)}</description>
    {for $row in $TopicSet[@PROJECTID=$project/@ID] return <Topic>{$row/fn:string(@VALUE)}</Topic>}
    {for $row in $TagSet[@PROJECTID=$project/@ID] return <Tag>{$row/fn:string(@VALUE)}</Tag>}
    </project>
    ;

However, it took 9 hours to complete (there are 200k projects in the table).

How can I improve that?
Do I really need the three intermediate db2-fn:sqlquery calls to achieve this? Is there another way?
Would it be faster if I stored the results of these three intermediate db2-fn:sqlquery calls in a table (with only one row and one column), and then indexed it before running the "for $project in $projects return" part?


Or, how would you proceed to achieve my goal?


Best regards,
David

---
As proposed by Peter Schuetze, I tried XMLAGG as follows:

    SELECT
    XMLSERIALIZE(
      XMLDOCUMENT(
        XMLELEMENT(
          NAME "Project",
          XMLATTRIBUTES(P.project),
          XMLAGG(XMLELEMENT(NAME "Topic", Topic.value)),
          XMLAGG(XMLELEMENT(NAME "Tag", Tag.value))
        )
      ) AS CLOB(1M)
    )
    FROM mydb.project P
    LEFT JOIN mydb.Topic Topic ON (P.project = Topic.project)
    LEFT JOIN mydb.Tag Tag ON (P.project = Tag.project)
    GROUP BY P.project;

This is indeed much, much faster!
However, if a project has no topics, the output still contains a topic element with empty text, such as:
    <projects>
      <project id="project1">
        <title>title1</title>
        <description>desc1</description>
        <topic></topic>
        <tag>tag1</tag>
        <tag>tag2</tag>
        <tag>tag3</tag>  
      </project>
      ...
    </projects>
How can I remove this "<topic></topic>"?

2 Answers:

Answer 0 (score: 1)

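If the column could be NULL and you don't want an empty element tag when that happens, use XMLFOREST instead of XMLELEMENT. So, for topic, you would replace its XMLELEMENT function with:

    XMLFOREST( Topic.value AS "topic" )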

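There is a problem with the way you've included two XMLAGG functions in the same SELECT statement. If your statement had only one XMLAGG, there would be no problem, because the GROUP BY on the parent key would neatly collapse the child entries specified in the XMLAGG. However, when you specify more than one XMLAGG function in the same SELECT, the query internally produces a Cartesian product, so you will see duplicates within each group returned by XMLAGG. The example you gave, a project with only zero or one topic, doesn't demonstrate the problem, but if a project had two topics and three tags, you would see each topic repeated three times and each tag repeated twice. To prevent this, you need to relocate each XMLAGG into a subquery or a common table expression that produces a single XML fragment, so that you can safely reference it from the main query.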
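The row multiplication described above can be reproduced on a small scale. The following is only a minimal sketch, not DB2 code: it assumes Python's built-in sqlite3 module as a stand-in database, uses COUNT in place of XMLAGG to show how many values each aggregate would receive, and works on made-up sample rows (one project with two topics and three tags).

```python
import sqlite3

# In-memory stand-in for the three tables from the question (sample data is invented).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project (id TEXT PRIMARY KEY, title TEXT, description TEXT);
    CREATE TABLE topic (projectId TEXT, value TEXT);
    CREATE TABLE tag (projectId TEXT, value TEXT);
    INSERT INTO project VALUES ('p1', 'title1', 'desc1');
    INSERT INTO topic VALUES ('p1', 'topic1'), ('p1', 'topic2');
    INSERT INTO tag VALUES ('p1', 'tag1'), ('p1', 'tag2'), ('p1', 'tag3');
""")

# Joining both child tables before grouping pairs every topic with every tag,
# so each aggregate sees the multiplied rows rather than the distinct children.
row = conn.execute("""
    SELECT p.id,
           COUNT(*)       AS joined_rows,
           COUNT(t.value) AS topic_values,
           COUNT(g.value) AS tag_values
    FROM project p
    LEFT JOIN topic t ON t.projectId = p.id
    LEFT JOIN tag g ON g.projectId = p.id
    GROUP BY p.id
""").fetchone()

project_id, joined_rows, topic_values, tag_values = row
print(project_id, joined_rows, topic_values, tag_values)
```

With 2 topics and 3 tags, the group holds 2 x 3 = 6 joined rows, so an aggregate over topic values receives 6 inputs (each topic three times) and an aggregate over tag values also receives 6 (each tag twice).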

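Below is an example that pushes the XMLAGG calls into common table expressions. It also eliminates the need for XMLFOREST, because XMLAGG produces nothing for an empty input set.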

    WITH
    topicxml ( projectid, xmlfragment ) AS (
        SELECT topic.projectid,
               XMLAGG( XMLELEMENT( NAME "topic", topic.value ) ORDER BY topic.value )
        FROM mydb.topic topic
        GROUP BY topic.projectid
    ),
    tagxml ( projectid, xmlfragment ) AS (
        SELECT tag.projectid,
               XMLAGG( XMLELEMENT( NAME "tag", tag.value ) ORDER BY tag.value )
        FROM mydb.tag tag
        GROUP BY tag.projectid
    )
    SELECT XMLSERIALIZE ( CONTENT XMLELEMENT( NAME "project",
               XMLATTRIBUTES( p.id AS "id" ),
               XMLELEMENT( NAME "title", p.title ),
               XMLELEMENT( NAME "description", p.description ),
               XMLCONCAT( topicxml.xmlfragment, tagxml.xmlfragment )
           ) AS VARCHAR(2000) )
    FROM mydb.project p
    LEFT OUTER JOIN topicxml ON topicxml.projectid = p.id
    LEFT OUTER JOIN tagxml ON tagxml.projectid = p.id
    ;
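
The shape of this query (aggregate each child table on its own, then join the one-row-per-project results to the parent) can be checked outside DB2. The sketch below is an assumption-laden stand-in, not the answer's method: it uses Python's sqlite3 module, group_concat instead of XMLAGG, a '|' separator that assumes no value contains that character, ElementTree to assemble the XML, and invented sample rows (project p2 deliberately has no topics).

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project (id TEXT PRIMARY KEY, title TEXT, description TEXT);
    CREATE TABLE topic (projectId TEXT, value TEXT);
    CREATE TABLE tag (projectId TEXT, value TEXT);
    INSERT INTO project VALUES ('p1', 'title1', 'desc1'), ('p2', 'title2', 'desc2');
    INSERT INTO topic VALUES ('p1', 'topic1'), ('p1', 'topic2');
    INSERT INTO tag VALUES ('p1', 'tag1'), ('p1', 'tag2'), ('p1', 'tag3'), ('p2', 'tag1');
""")

# Each child table is collapsed to one row per project before the join,
# so the two aggregates never see each other's rows.
rows = conn.execute("""
    WITH topicagg (projectId, topic_list) AS (
        SELECT projectId, group_concat(value, '|') FROM topic GROUP BY projectId
    ),
    tagagg (projectId, tag_list) AS (
        SELECT projectId, group_concat(value, '|') FROM tag GROUP BY projectId
    )
    SELECT p.id, p.title, p.description, topicagg.topic_list, tagagg.tag_list
    FROM project p
    LEFT JOIN topicagg ON topicagg.projectId = p.id
    LEFT JOIN tagagg ON tagagg.projectId = p.id
    ORDER BY p.id
""").fetchall()

root = ET.Element("projects")
for pid, title, description, topic_list, tag_list in rows:
    project = ET.SubElement(root, "project", id=pid)
    ET.SubElement(project, "title").text = title
    ET.SubElement(project, "description").text = description
    for name, packed in (("topic", topic_list), ("tag", tag_list)):
        # The LEFT JOIN yields NULL (None) when a project has no child rows,
        # so no empty element is emitted for it.
        values = sorted(packed.split("|")) if packed else []
        for value in values:
            ET.SubElement(project, name).text = value

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

In this sketch p1 ends up with each topic and tag exactly once, and p2 has no topic element at all, which is the behavior the common table expressions are meant to give.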

Answer 1 (score: 0)

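Take a look at the XMLAGG function. That should be what you need. I haven't tried it myself, but the example on the linked page is pretty much what you want to do.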