XML XPath: ignoring case and whitespace

Posted: 2013-05-29 03:26:53

Tags: xml xpath php

I have searched, but I still don't have a clear picture.
I have this XML saved locally in xml.xml:

<ITEM NAME='Sample'>
   ..some other node here
</ITEM >
<ITEM NAME='SamPlE lorem'>
   ..some other node here
</ITEM >
<ITEM  NAME='Sam Ple lorem ipsum'>
   ..some other node here
</ITEM >
<ITEM  NAME='sample'>
   ..some other node here
</ITEM >
<ITEM  NAME='SAMPLE'>
   ..some other node here
</ITEM >

$xmlfile = 'localhost/project/xml.xml';
$xml = simplexml_load_file($xmlfile);

I need to search for the string "sample", ignoring case and whitespace, so that every node in the XML above matches. So far I only have this:

 // ITEM is not a parent node, which is why I am using this line
 // to lead me to the part of my XML
 // that matches my contains() search

 $string = "sample";
 $result = $xml->xpath("//ITEM[contains(@NAME, '$string')]");

But the only result I get is:

<ITEM  NAME='sample'>
   ..some other node here
</ITEM >

I also tried the translate function suggested in "How do i make Xpath search case insensitive", but I always get an error.

2 answers:

Answer 0 (score: 1)

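SimpleXML's XPath is not well suited to doing the whole job. Case-insensitive search in particular is very awkward; in fact, you already ran into too many problems with it in the related question.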

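One way to simplify the job is to divide it: first get a list of all interesting elements or attributes, then filter them, then get the parent elements of whatever remains.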

This is easy to do by turning the xpath result (an array) into an Iterator:

$string   = "sample";
$names    = $xml->xpath('//ITEM/@NAME');
$filtered = new LaxStringFilterIterator($names, $string);
$items    = new SimpleXMLParentNodesIterator($filtered);

foreach ($items as $item) {
    echo $item->asXML(), "\n";
}

This then outputs the nodes that were searched for (example output):

<ITEM NAME="Sample">
   ..some other node here
</ITEM>
<ITEM NAME="SamPlE lorem">
   ..some other node here
</ITEM>
<ITEM NAME="Sam Ple lorem ipsum">
   ..some other node here
</ITEM>
<ITEM NAME="sample">
   ..some other node here
</ITEM>
<ITEM NAME="SAMPLE">
   ..some other node here
</ITEM>

The separate piece that filters an array by string value:

/**
 * Class LaxStringFilterIterator
 *
 * Search for needle in case-insensitive manner on a subject
 * with spaces removed.
 */
class LaxStringFilterIterator extends FilterIterator
{
    private $quoted;

    /**
     * @param Traversable|Array|Object $it
     * @param string $needle
     */
    public function __construct($it, $needle) {
        parent::__construct($it instanceof Traversable ? new IteratorIterator($it) : new ArrayIterator($it));
        $this->quoted = preg_quote($needle, '/'); // also escape the "/" pattern delimiter
    }

    public function accept() {
        $pattern = sprintf('/%s/i', $this->quoted);
        // Remove all whitespace from the attribute value before matching.
        $subject = preg_replace('/\s+/', '', (string) parent::current());
        return preg_match($pattern, $subject) === 1;
    }
}

The parent-node decorator:

/**
 * Class SimpleXMLParentNodesIterator
 *
 * Return parent nodes instead of current SimpleXMLElement Nodes,
 * for example the element of an attribute.
 */
class SimpleXMLParentNodesIterator extends IteratorIterator
{
    public function current() {
        $current = parent::current();
        list($parent) = $current[0]->xpath('..');
        return $parent;
    }
}
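
The same divide-and-filter steps can also be sketched without custom iterator classes. This is a minimal sketch, not part of the original answer: it selects the ITEM elements directly and filters them in PHP with array_filter. The inline sample data, the `<ROOT>` wrapper element and the extra `other` item are assumptions added so the snippet runs on its own.

```php
<?php
// Sample data modelled on the question. The <ROOT> wrapper and the
// 'other' item are assumptions added for illustration.
$xml = simplexml_load_string(<<<XML
<ROOT>
  <ITEM NAME='Sample'/>
  <ITEM NAME='SamPlE lorem'/>
  <ITEM NAME='Sam Ple lorem ipsum'/>
  <ITEM NAME='sample'/>
  <ITEM NAME='SAMPLE'/>
  <ITEM NAME='other'/>
</ROOT>
XML
);

$string = 'sample';

// Step 1: collect every ITEM element, wherever it sits in the tree.
$items = $xml->xpath('//ITEM');

// Step 2: keep the elements whose NAME, with all whitespace removed,
// contains the needle when compared case-insensitively.
$matches = array_filter($items, function ($item) use ($string) {
    $name = preg_replace('/\s+/', '', (string) $item['NAME']);
    return stripos($name, $string) !== false;
});

foreach ($matches as $item) {
    echo $item['NAME'], "\n";
}
```

Because the filter works on the elements themselves rather than on their attributes, no parent lookup is needed afterwards. Note that stripos() only folds case for ASCII letters.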

Answer 1 (score: -1)

If you want to get every @NAME that starts with 'sample', regardless of case and spaces, you have to use:

//ITEM[matches(normalize-space(@NAME), '^[sS]\s?[aA]\s?[mM]\s?[pP]\s?[lL]\s?[eE]')]

Output: all items
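
Note that matches() is an XPath 2.0 function, while PHP's SimpleXML is built on libxml2, which implements XPath 1.0 only, so the expression above is not expected to work with $xml->xpath(). A rough XPath 1.0 sketch of the same idea uses translate(), which can lowercase and strip spaces in one call: characters in its second argument that have no counterpart in the third are deleted. The sample data and the `<ROOT>` wrapper below are assumptions for illustration, and the sketch assumes the search string contains no single quote.

```php
<?php
// Sample data modelled on the question. The <ROOT> wrapper and the
// 'other' item are assumptions added for illustration.
$xml = simplexml_load_string(<<<XML
<ROOT>
  <ITEM NAME='Sample'/>
  <ITEM NAME='SamPlE lorem'/>
  <ITEM NAME='Sam Ple lorem ipsum'/>
  <ITEM NAME='sample'/>
  <ITEM NAME='SAMPLE'/>
  <ITEM NAME='other'/>
</ROOT>
XML
);

$string = 'sample'; // assumed to contain no single quote

$upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
$lower = 'abcdefghijklmnopqrstuvwxyz';

// translate() maps each upper-case letter to its lower-case counterpart.
// The trailing space in the second argument has no counterpart in the
// third argument, so spaces are removed from @NAME before comparing.
$query = sprintf(
    "//ITEM[contains(translate(@NAME, '%s ', '%s'), '%s')]",
    $upper,
    $lower,
    strtolower($string)
);

$result = $xml->xpath($query);

foreach ($result as $item) {
    echo $item['NAME'], "\n";
}
```

This only folds the ASCII letters listed in $upper; to require that the name begins with the search string, contains() could be swapped for starts-with(), which is also available in XPath 1.0.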