DOM XPath queries fail

Asked: 2016-01-19 01:37:40

Tags: php xml dom xpath

I have this code sample (full code here):

$raw = " ( xml data ) ";

$nsURIs = array
(
  'opf'     => 'http://www.idpf.org/2007/opf',
  'dc'      => 'http://purl.org/dc/elements/1.1/',
  'dcterms' => 'http://purl.org/dc/terms/',
  'xsi'     => 'http://www.w3.org/2001/XMLSchema-instance',
  'ncx'     => 'http://www.daisy.org/z3986/2005/ncx/',
  'calibre' => 'http://calibre.kovidgoyal.net/2009/metadata'
);

$dom   = new \DOMDocument();
$dom->loadXML( $raw,LIBXML_NOBLANKS );

$metadata = $dom->getElementsByTagName( 'metadata' )->item(0);         #metadata

$xpath = new \DOMXPath( $dom );
foreach( $nsURIs as $key => $ns ) $xpath->registerNamespace( $key, $ns );

$query = array();
$query[] = '//dc:identifier';                                          #00
$query[] = '//dc:identifier[. = "9780439554930"]';                     #01
$query[] = '//dc:identifier[@opf:scheme="ISBN"]';                      #02
$query[] = '//dc:identifier[starts-with(@opf:scheme,"I")]';            #03
$query[] = '//dc:identifier[contains(@opf:scheme,"SB")]';              #04
$query[] = '//dc:identifier[ends-with(@opf:scheme,"N")]';              #05 Unregistered
$query[] = '//dc:identifier["IS" = substring(@opf:scheme, 0, 2)]';     #06 Fails
$query[] = '//dc:identifier[contains(.,"439")]';                       #07
$query[] = '//dc:identifier[@*="ISBN"]';                               #08
$query[] = '//dc:date[contains(@*, "cation")]';                        #09
$query[] = '//dc:*[contains(@opf:*, "and")]';                          #10 Wrong Result
$query[] = '//dc:*[contains(@opf:file-as, "and")]';                    #11
$query[] = '//dc:*[contains(@opf:*, "ill")]';                          #12
$query[] = '//dc:contributor[@opf:role and @opf:file-as]';             #13
$query[] = '//dc:subject[contains(.,"anta") and contains(.,"Urban")]'; #14
$query[] = '//dc:subject[text() = "Fantasy"]';                         #15

for( $i=0; $i<count($query); $i++ )
{
  $result = $xpath->evaluate( $query[$i] );
  echo sprintf( "[%02d]  % 2d  %s\n", $i, $result->length, $query[$i] );
}

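Query #05 fails because ends-with() is not a registered function; query #06 fails (0 results instead of 1); and query #10 returns 1 item instead of 2 (the next query, #11, correctly returns both). The results are the same when the queries are run with $metadata as the context node.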

In this question I found a workaround for the unregistered ends-with():

$query[] = '//dc:identifier["N" = substring(@opf:scheme, string-length(@opf:scheme) - string-length("N"))]';

But even this hack fails...

Does anyone have a suggestion or an alternative?

1 Answer:

Answer 0 (score: 1)

Regarding #06 Fails

//dc:identifier["IS" = substring(@opf:scheme, 0, 2)]

Explanation:

XPath string positions start at 1, not 0, so the correct arguments for substring() are:

//dc:identifier["IS" = substring(@opf:scheme, 1, 2)]
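The 1-based indexing can be checked directly with DOMXPath::evaluate(). The sketch below uses a small stand-in document (the question's real XML is not shown, so the element and attribute values here are assumptions). It also suggests why the ends-with() workaround quoted in the question may still fail: with 1-based positions, the start offset appears to need a trailing "+ 1".

```php
<?php
// Hypothetical sample data standing in for the question's "( xml data )".
$raw = <<<XML
<package xmlns:opf="http://www.idpf.org/2007/opf"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <metadata>
    <dc:identifier opf:scheme="ISBN">9780439554930</dc:identifier>
  </metadata>
</package>
XML;

$dom = new DOMDocument();
$dom->loadXML($raw);

$xpath = new DOMXPath($dom);
$xpath->registerNamespace('opf', 'http://www.idpf.org/2007/opf');
$xpath->registerNamespace('dc', 'http://purl.org/dc/elements/1.1/');

// A start of 0 with length 2 covers positions 0 and 1, and only position 1 exists.
$zeroBased = $xpath->evaluate('substring("ISBN", 0, 2)');  // "I"
$oneBased  = $xpath->evaluate('substring("ISBN", 1, 2)');  // "IS"

// XPath 1.0 emulation of ends-with(): start at length - suffixLength + 1.
$endsWithN = $xpath->query(
    '//dc:identifier["N" = substring(@opf:scheme, '
    . 'string-length(@opf:scheme) - string-length("N") + 1)]'
);

// Same expression without "+ 1": compares "N" against the last TWO characters.
$offByOne = $xpath->query(
    '//dc:identifier["N" = substring(@opf:scheme, '
    . 'string-length(@opf:scheme) - string-length("N"))]'
);

printf("%s %s %d %d\n", $zeroBased, $oneBased, $endsWithN->length, $offByOne->length);
```

With this sample data, the "+ 1" variant matches the ISBN identifier while the variant without it matches nothing, which is consistent with the behaviour described in the question.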

Regarding #10 Wrong Result

//dc:*[contains(@opf:*, "and")]

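Explanation:

In XPath 1.0, when a node-set containing several nodes is passed to a function that expects a string, only the first node (in document order) is used. So in this case contains() only looks at the first attribute with the opf prefix, and the following element is not matched because its first opf attribute is opf:role: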

<dc:contributor opf:role="ill" opf:file-as="GrandPré, Mary">Mary GrandPré</dc:contributor>

To avoid this problem, change the XPath so that the test is applied to each attribute:

//dc:*[@opf:*[contains(., "and")]]
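A minimal sketch comparing the two forms side by side. The dc:contributor element is the one quoted above; the dc:creator element is hypothetical sample data, added only so that one element has a matching attribute in first position.

```php
<?php
// dc:creator is an assumed example; dc:contributor is taken from the answer.
$raw = <<<XML
<metadata xmlns:opf="http://www.idpf.org/2007/opf"
          xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:creator opf:file-as="Brandon, Sam" opf:role="aut">Sam Brandon</dc:creator>
  <dc:contributor opf:role="ill" opf:file-as="GrandPré, Mary">Mary GrandPré</dc:contributor>
</metadata>
XML;

$dom = new DOMDocument();
$dom->loadXML($raw);

$xpath = new DOMXPath($dom);
$xpath->registerNamespace('opf', 'http://www.idpf.org/2007/opf');
$xpath->registerNamespace('dc', 'http://purl.org/dc/elements/1.1/');

// contains() converts the node-set @opf:* to the string value of its first node,
// so dc:contributor (whose first opf attribute is role="ill") is skipped.
$firstOnly = $xpath->query('//dc:*[contains(@opf:*, "and")]');

// The nested predicate tests every opf attribute individually.
$anyAttr = $xpath->query('//dc:*[@opf:*[contains(., "and")]]');

printf("first attribute only: %d, any attribute: %d\n",
       $firstOnly->length, $anyAttr->length);
```

Moving contains() into a predicate on the attribute step means each attribute becomes the context node in turn, so no node-set to string conversion takes place.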