使用xpath从postgres中的XML列中提取数据

时间:2013-07-23 00:53:27

标签: xml postgresql xpath xml-parsing

我做了下表:

create table temp.promotions_xml(id serial promotion_xml xml);

我已将以下数据插入temp.promotions:

<promotions xmlns="http://www.demandware.com/xml/impex/promotion/2008-01-31">
    <campaign campaign-id="2013-1st-semester-jet-giveaways">
        <description>2013 1st Semester Jet Giveaways</description>
        <enabled-flag>true</enabled-flag>
        <start-date>2013-01-01T05:00:00.000Z</start-date>
        <end-date>2013-07-01T04:00:00.000Z</end-date>
        <customer-groups>
            <customer-group group-id="Everyone"/>
        </customer-groups>
    </campaign>
</promotions>

数据在表格中。

我无法弄明白如何解决这个问题。我可能希望能够填充我将构建的关系模型,所以我想摆脱所有标签。

以下是我尝试过的一些不起作用的查询。我很确定我只是围绕正确的语法跳舞。这些查询返回空集行。

FWIW,我们正在使用Postgres 9.0.4。

谢谢, - sw

select xpath('/promotions/campaign/description/text()',promotion_xml) textcol from temp.promotions_xml
select xpath('./promotions/campaign/description/text()',promotion_xml) textcol from temp.promotions_xml
select xpath('promotions/campaign/description/text()',promotion_xml) textcol from temp.promotions_xml
select xpath('///description/text()',promotion_xml) textcol from temp.promotions_xml
select xpath('//description/text()',promotion_xml) textcol from temp.promotions_xml
select xpath('.//description/text()',promotion_xml) textcol from temp.promotions_xml
select xpath('./campaign/description/text()',promotion_xml) textcol from temp.promotions_xml
select xpath('//campaign/description/text()',promotion_xml) textcol from temp.promotions_xml

1 个答案:

答案 0 :(得分:16)

这有效:

WITH tbl(p_xml) AS (  -- CTE just to provide test table with xml value
   SELECT '<promotions xmlns="http://www.demandware.com/xml/impex/promotion/2008-01-31">
              <campaign campaign-id="2013-1st-semester-jet-giveaways">
                 <description>2013 1st Semester Jet Giveaways</description>
                 <enabled-flag>true</enabled-flag>
                 <start-date>2013-01-01T05:00:00.000Z</start-date>
                 <end-date>2013-07-01T04:00:00.000Z</end-date>
                 <customer-groups>
                    <customer-group group-id="Everyone"/>
                 </customer-groups>
              </campaign>
           </promotions>'::xml
    )  -- end of CTE, the rest is the solution
SELECT xpath('/n:promotions/n:campaign/n:description/text()', p_xml
           , '{{n,http://www.demandware.com/xml/impex/promotion/2008-01-31}}')
FROM   tbl;

返回:

{"2013 1st Semester Jet Giveaways"}

请注意我如何为third argument of xpath()中的命名空间分配命名空间别名 n,并在xpath的每个级别使用它。

如果从文档中删除XML命名空间,一切都变得简单了:

WITH tbl(p_xml) AS (  -- not the missing namespace below
   SELECT '<promotions>
              <campaign campaign-id="2013-1st-semester-jet-giveaways">
                 <description>2013 1st Semester Jet Giveaways</description>
                 <enabled-flag>true</enabled-flag>
                 <start-date>2013-01-01T05:00:00.000Z</start-date>
                 <end-date>2013-07-01T04:00:00.000Z</end-date>
                 <customer-groups>
                    <customer-group group-id="Everyone"/>
                 </customer-groups>
              </campaign>
           </promotions>'::xml
   )
SELECT xpath('/promotions/campaign/description/text()', p_xml)
FROM   tbl;

<rant>仅仅是我或者每个人都对json and jsonb感到高兴,因此我们不必处理XML。</rant>