Can I use XPath to get a value from a dictionary in a content attribute?

Time: 2019-06-10 10:17:53

Tags: html dictionary xpath

Here is an example of the meta tag from which I want to get pub_date:

<meta name="parsely-page" content='{"title":"Article title","link":"https:\/\/site.com\/category\/article","type":"post","section":"category","image_url":"","author":null,"pub_date":"2009-03-01T14:17:14+00:00","post_id":"article_6463676334","tags":[]}' />

The XPath to get the whole content attribute would be:

//meta[@name="parsely-page"]/@content

Is it possible to get the value of a dict key using XPath?

2 answers:

Answer 0: (score: 0)

With XPath 3.1, you can use

//meta[@name="parsely-page"]/parse-json(@content)?pub_date

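Sadly, you are quite likely using an XPath processor that supports only XPath 1.0, in which case you cannot use this expression unless you switch to a different processor.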
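If only an XPath 1.0 style processor is available, one common workaround is to let XPath select the attribute and leave the JSON parsing to the host language. The following is a minimal sketch, assuming Python as the host language and assuming the markup is well-formed XML (real-world HTML often is not, and would need an HTML-aware parser). The `<head>` wrapper, the `HTML_SNIPPET` name, and the `extract_pub_date` helper are made up for illustration; the standard-library `xml.etree.ElementTree` module supports only a limited XPath subset, which is enough for the attribute predicate used here.

```python
import json
import xml.etree.ElementTree as ET

# The meta tag from the question, wrapped in a hypothetical <head> element.
# A raw string keeps the JSON "\/" escapes intact.
HTML_SNIPPET = r'''<head><meta name="parsely-page" content='{"title":"Article title","link":"https:\/\/site.com\/category\/article","type":"post","section":"category","image_url":"","author":null,"pub_date":"2009-03-01T14:17:14+00:00","post_id":"article_6463676334","tags":[]}' /></head>'''


def extract_pub_date(markup: str) -> str:
    """Select the meta element with a simple XPath, then parse its JSON content."""
    root = ET.fromstring(markup)
    # ElementTree's XPath subset handles the [@name='...'] predicate.
    meta = root.find(".//meta[@name='parsely-page']")
    if meta is None:
        raise ValueError("no parsely-page meta element found")
    data = json.loads(meta.get("content"))
    return data["pub_date"]


pub_date = extract_pub_date(HTML_SNIPPET)
print(pub_date)
```

Parsing the attribute as JSON, rather than slicing the string, also copes with keys whose values are not plain strings, such as `"author":null`.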

Answer 1: (score: 0)

Using XSLT 1.0:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:variable name="vQ">"</xsl:variable>
  <xsl:template match="/">
    <xsl:value-of select=
    'substring-before(substring-after(//meta[@name="parsely-page"]/@content,
                                      concat($vQ, "pub_date", $vQ, ":", $vQ)), $vQ)'/>
  </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the following XML document (your meta tag):

<meta name="parsely-page"
content='{"title":"Article title","link":"https:\/\/site.com\/category\/article","type":"post","section":"category","image_url":"","author":null,"pub_date":"2009-03-01T14:17:14+00:00","post_id":"article_6463676334","tags":[]}' />

the wanted result is produced:

2009-03-01T14:17:14+00:00

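We can also write a single XPath 1.0 expression that evaluates to the wanted string. However, if no variable is used for the quote character, the quotes and apostrophes must be escaped (here as XML entities) to avoid nesting errors.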

substring-before(substring-after(//meta[@name="parsely-page"]/@content, 
                                 &apos;&quot;pub_date&quot;:&quot;&apos;), 
                 &apos;&quot;&apos;)

Verification with XSLT 1.0:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <!-- No quote variable is needed here: the quotes are written as entities. -->
  <xsl:template match="/">
    <xsl:value-of select=
    'substring-before(substring-after(//meta[@name="parsely-page"]/@content,
                                      &apos;&quot;pub_date&quot;:&quot;&apos;), 
                      &apos;&quot;&apos;)'/>
  </xsl:template>
</xsl:stylesheet>

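When this transformation is applied to the same XML document (above), it evaluates the single XPath 1.0 expression and outputs the wanted, correct result: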

2009-03-01T14:17:14+00:00
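To see why the nested substring-after / substring-before calls work, the two steps can be traced outside XSLT. This is only an illustrative sketch in Python, not part of the XSLT solution: `substring_after` and `substring_before` are hypothetical helpers that mimic the XPath 1.0 functions for non-empty markers, and `CONTENT` is the attribute value from the question.

```python
def substring_after(text: str, marker: str) -> str:
    """Mimic XPath 1.0 substring-after (assumes a non-empty marker)."""
    _, found, tail = text.partition(marker)
    # XPath returns an empty string when the marker does not occur.
    return tail if found else ""


def substring_before(text: str, marker: str) -> str:
    """Mimic XPath 1.0 substring-before (assumes a non-empty marker)."""
    head, found, _ = text.partition(marker)
    return head if found else ""


# The content attribute value from the question (raw string keeps the "\/" escapes).
CONTENT = r'{"title":"Article title","link":"https:\/\/site.com\/category\/article","type":"post","section":"category","image_url":"","author":null,"pub_date":"2009-03-01T14:17:14+00:00","post_id":"article_6463676334","tags":[]}'

# Step 1: drop everything up to and including the text "pub_date":" (with its quotes).
after_key = substring_after(CONTENT, '"pub_date":"')
# Step 2: keep everything before the next quote, which closes the value.
pub_date = substring_before(after_key, '"')
print(pub_date)

# A key whose value is not a quoted string yields an empty string.
author = substring_before(substring_after(CONTENT, '"author":"'), '"')
```

The approach relies on the value being a quoted JSON string that contains no escaped quotes. For a key such as author, whose value here is null, the marker is never found and the result is an empty string.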