Question

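I have a text field in a PostgreSQL database that contains an HTML table. I want to extract some data from that field: each TR (except the header one) as a row, and each TD as a column. Is that possible?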

The table is named "documentos", and the text field that holds the HTML table is named "props". I read the field with this query:

select props
from documentos
where uidref = 'ee41f201-0049-41e9-9c5d-5c35e2cf73ac'

I would like to get:

444444444 | Investigador | Daniel | Perez
555555555 | Becario      | Jorge  | Fernandez

Thanks!

Answer 1

I have no experience with PostgreSQL or with XPath, but I was able to put together something that may help you:

with x as (select
'<TABLE>
        <TBODY>
            <TR>
                <TH class="RowTitle">Identificacion</TH>
                <TH class="colRol">Rol</TH>
            </TR>
            <TR class="tData">
                <TD class="RowTitle">
                    <A href="#">4444</A>
                </TD>
                <TD class="colRow" val="INVARGEXT">Investigador</TD>
            </TR>
            <TR class="tData">
                <TD class="RowTitle">
                    <A href="#">55555</A>
                </TD>
                <TD class="colRow" val="BECARIO">Becario</TD>
            </TR>
        </TBODY>
    </TABLE>'::xml as t
),
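-- one row per data <TR>; the header row has no class="tData", so it is skipped
y as (select unnest(xpath('//TR[@class="tData"]', t)) as tr from x)
select -- y.tr, -- uncomment to debug
xpath('//TD[@class="RowTitle"]/A/text()', y.tr),  -- text of the link in the first cell
xpath('//TD/text()', y.tr)                        -- text nodes directly inside each TD
from y;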
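To run the same idea against the real table instead of an inline literal, something like the following might work. It is only a sketch under a few assumptions: that "props" holds well-formed XML (every tag closed, attributes quoted), since otherwise the cast to xml fails; that the class names tData, RowTitle and colRow from my sample match your markup; and that the column aliases identificacion and rol are just illustrative names. Taking element [1] of each xpath() result and casting it to text gives plain values instead of arrays.

```sql
-- Sketch only: assumes props is well-formed XML and uses the class names
-- from the sample above (tData, RowTitle, colRow), which may differ in your data.
with data_rows as (
    -- one xml fragment per data row of the stored table
    select unnest(xpath('//TR[@class="tData"]', props::xml)) as tr
    from documentos
    where uidref = 'ee41f201-0049-41e9-9c5d-5c35e2cf73ac'
)
select
    -- xpath() returns xml[]; take the first match and cast it to text
    btrim((xpath('//TD[@class="RowTitle"]/A/text()', tr))[1]::text) as identificacion,
    btrim((xpath('//TD[@class="colRow"]/text()', tr))[1]::text)     as rol
from data_rows;
```

Further columns (for example first and last name) could be added the same way, with one xpath() expression per TD, once you know which class or position identifies each cell.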

I hope this works.

More information here and here.

Parsing an HTML field in a PostgreSQL query
