XPath starts-with for a non-ASCII first character (PHP DOMDocument)

Date: 2014-01-27 18:03:35

Tags: php xml xpath domdocument

<?php
$dom = new DOMDocument();

$dom->loadXML('<?xml version="1.0" encoding="UTF-8" standalone="yes"?><sst><si><t>andy</t>    </si><si><t>billy</t></si><si><t>中文</t></si></sst>');

$xpath = new DOMXPath($dom);
// Select the text of every <t> element whose value starts with 'a'.
$entities = $xpath->query("//t[starts-with(., 'a')]/text()");
foreach ($entities as $entity) {
    echo $entity->nodeValue;
}
?>

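In the example above, I want to query all t elements whose value starts with a particular letter. I can select the words that start with a-z, but I don't know how to select the ones that start with a non-ASCII character.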

    $entities = $xpath -> query("//t[not (starts-with(.,'a') or starts-with(.,'b'))]/text()");

Is there a smarter way than the approach above, which would need 26 tests (starts-with(.,'a') through starts-with(.,'z'))?

Thanks

1 answer:

Answer 0: (score: 1)

Get the first character of the current element:

substring(.,1,1)

Replace every character in the list of lowercase ASCII letters with a:

translate(substring(.,1,1),'abcdefghijklmnopqrstuvwxyz','aaaaaaaaaaaaaaaaaaaaaaaaaa')

Check that the result is not a:

translate(substring(.,1,1),'abcdefghijklmnopqrstuvwxyz','aaaaaaaaaaaaaaaaaaaaaaaaaa') != 'a'
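
To see what each of these steps produces, the expressions can be evaluated one element at a time. The following is only a sketch: it assumes that `DOMXPath::evaluate()` is called with each `t` element as the context node, and the variable names (`$letters`, `$allA`, `$rows`) are made up for illustration.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML('<sst><si><t>andy</t></si><si><t>billy</t></si><si><t>中文</t></si></sst>');
$xpath = new DOMXPath($dom);

$letters = 'abcdefghijklmnopqrstuvwxyz';
$allA    = str_repeat('a', strlen($letters)); // 26 times 'a'

$rows = [];
foreach ($xpath->query('//t') as $t) {
    // Step 1: first character of the element's string value.
    $first = $xpath->evaluate('substring(., 1, 1)', $t);
    // Step 2: map any lowercase ASCII letter to 'a'; other characters pass through.
    $mapped = $xpath->evaluate("translate(substring(., 1, 1), '$letters', '$allA')", $t);
    // Step 3: true when the first character was not a lowercase ASCII letter.
    $keep = $xpath->evaluate("translate(substring(., 1, 1), '$letters', '$allA') != 'a'", $t);
    $rows[] = [$t->textContent, $first, $mapped, $keep];
}

foreach ($rows as [$text, $first, $mapped, $keep]) {
    printf("%s: first=%s mapped=%s keep=%s\n", $text, $first, $mapped, var_export($keep, true));
}
```

Only the element whose first character survives `translate()` unchanged and differs from `a` ends up with `keep=true`.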

Full example:

$dom = new DOMDocument();
$dom ->loadXML('<?xml version="1.0" encoding="UTF-8" standalone="yes"?><sst><si><t>andy</t>    </si><si><t>billy</t></si><si><t>中文</t></si></sst>');

$xpath = new DomXPath($dom);

$entities = $xpath->evaluate(
  "//t[translate(substring(.,1,1), 'abcdefghijklmnopqrstuvwxyz','aaaaaaaaaaaaaaaaaaaaaaaaaa') != 'a']/text()"
);
foreach ($entities as $entity) {
  echo $entity->nodeValue;
}

Output:

中文
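
One caveat: the character list in this answer contains only lowercase letters, so a value such as "Billy" would also be reported as not starting with an ASCII letter. If that is not wanted, the uppercase letters can be appended to the list. The sketch below assumes this variation; the sample data (with "Billy" capitalized) and the helpers `nonLetterQuery()` and `textValues()` are hypothetical and not part of the original answer.

```php
<?php
// Hypothetical sample data: "Billy" is capitalized to show the case issue.
$dom = new DOMDocument();
$dom->loadXML('<sst><si><t>andy</t></si><si><t>Billy</t></si><si><t>中文</t></si></sst>');
$xpath = new DOMXPath($dom);

// Build the query for a given list of characters that count as ASCII letters.
function nonLetterQuery(string $letters): string
{
    // translate() needs a replacement string of the same length as the list.
    $allA = str_repeat('a', strlen($letters));
    return "//t[translate(substring(., 1, 1), '$letters', '$allA') != 'a']/text()";
}

// Run a query and collect the matching text values in document order.
function textValues(DOMXPath $xpath, string $query): array
{
    $values = [];
    foreach ($xpath->query($query) as $node) {
        $values[] = $node->nodeValue;
    }
    return $values;
}

$lower = implode('', range('a', 'z'));
$both  = $lower . implode('', range('A', 'Z'));

$lowerOnly = textValues($xpath, nonLetterQuery($lower));
$bothCases = textValues($xpath, nonLetterQuery($both));

echo implode(', ', $lowerOnly), "\n"; // Billy, 中文
echo implode(', ', $bothCases), "\n"; // 中文
```

Either way, the test really means "does not start with one of the listed characters", so values that begin with a digit or punctuation mark are matched as well.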