Partial matching with XPath

Time: 2019-04-18 17:12:01

Tags: php xml xpath

I am trying to create a search function that allows partial matching on song title or genre using XPath.

Here is my XML file:

<?xml version="1.0" encoding="UTF-8"?>
<playlist>
  <item>
    <songid>USAT29902236</songid>
    <songtitle>I Say a Little Prayer</songtitle>
    <artist>Aretha Franklin</artist>
    <genre>Soul</genre>
    <link>https://www.amazon.com/I-Say-a-Little-Prayer/dp/B001BZD6KO</link>
    <releaseyear>1968</releaseyear>
  </item>
  <item>
    <songid>GBAAM8300001</songid>
    <songtitle>Every Breath You Take</songtitle>
    <artist>The Police</artist>
    <genre>Pop/Rock</genre>
    <link>https://www.amazon.com/Every-Breath-You-Take-Police/dp/B000008JI6</link>
    <releaseyear>1983</releaseyear>
  </item>
  <item>
    <songid>GBBBN7902002</songid>
    <songtitle>London Calling</songtitle>
    <artist>The Clash</artist>
    <genre>Post-punk</genre>
    <link>https://www.amazon.com/London-Calling-Remastered/dp/B00EQRJNTM</link>
    <releaseyear>1979</releaseyear>
  </item>
</playlist>

Here is my search function so far:

function searchSong($words){
    global $xml;

    if(!empty($words)){
        foreach($words as $word){
            //$query = "//playlist/item[contains(songtitle/genre, '{$word}')]";
            $query = "//playlist/item[(songtitle[contains('{$word}')]) and (genre[contains('{$word}')])]";
            $result = $xml->xpath($query);
        }
    }

    print_r($result);
}

Calling searchSong(array("take", "soul")) should return the second and first songs in the XML file, but the array is always empty.

2 answers:

Answer 0: (score: 2)

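There are a few mistakes here: you assume the search is case-insensitive, you use and instead of or, and you pass the wrong number of arguments to contains(). That last one triggers PHP warnings, if you are looking for them. Also, $result is overwritten on every loop iteration, so you only ever return the matches for the last search term.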
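As a minimal sketch of the last two syntax points only (two-argument contains() and or), the following runs against a trimmed copy of the playlist. The $source and $matches variables and the trimmed XML are assumptions made for illustration. The match is still case-sensitive here, so the term has to be spelled 'Take'.

```php
<?php
// Trimmed copy of the playlist from the question, inlined so the sketch runs on its own.
$source = <<<'XML'
<playlist>
  <item><songtitle>I Say a Little Prayer</songtitle><genre>Soul</genre></item>
  <item><songtitle>Every Breath You Take</songtitle><genre>Pop/Rock</genre></item>
  <item><songtitle>London Calling</songtitle><genre>Post-punk</genre></item>
</playlist>
XML;
$xml = simplexml_load_string($source);

// contains() takes two arguments: the string to search in and the substring to find.
// "or" keeps an item when either the title or the genre contains the term.
$word = 'Take'; // case-sensitive: 'take' would not match
$matches = $xml->xpath(
    "//playlist/item[contains(songtitle, '{$word}') or contains(genre, '{$word}')]"
);

foreach ($matches as $item) {
    echo $item->songtitle, PHP_EOL; // prints "Every Breath You Take"
}
```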

Case-insensitive searches in XPath 1.0 (which is all PHP supports) are a real pain:

// $xpath is a DOMXPath instance; $word must already be lowercase
$result = $xpath->query(
    "//playlist/item[(songtitle[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), '{$word}')]) or (genre[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), '{$word}')])]"
);

This assumes you have already converted the search term to lowercase. For example:

<?php

// $xpath is a DOMXPath instance; each remaining argument is one search term.
function searchSong($xpath, ...$words)
{
    $return = [];
    foreach($words as $word) {
        // translate() lowercases the node text, so the term must be lowercase too
        $word = strtolower($word);
        $q = "//playlist/item[(songtitle[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), '{$word}')]) or (genre[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), '{$word}')])]";
        $result = $xpath->query($q);
        // collect the matches for every term instead of overwriting them
        foreach($result as $node) {
            $return[] = $node;
        }
    }
    return $return;
}
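The function above expects a DOMXPath object, and it can return the same item twice when an item matches more than one term. A sketch of one way to set it up and de-duplicate follows. searchSongUnique() is a hypothetical helper, not part of the answer, and it assumes search terms contain no single quotes because each term is interpolated into an XPath string literal.

```php
<?php
// Hypothetical helper: case-insensitive search that returns each <item> at most once.
function searchSongUnique(DOMXPath $xpath, string ...$words): array
{
    $upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $lower = 'abcdefghijklmnopqrstuvwxyz';
    $found = [];
    foreach ($words as $word) {
        $word = strtolower($word);
        // Assumption: $word contains no single quote (it ends up inside '...').
        $q = "//playlist/item[contains(translate(songtitle, '$upper', '$lower'), '$word')"
           . " or contains(translate(genre, '$upper', '$lower'), '$word')]";
        foreach ($xpath->query($q) as $node) {
            // getNodePath() is unique per node, so it serves as a de-duplication key.
            $found[$node->getNodePath()] = $node;
        }
    }
    return array_values($found);
}

// Trimmed copy of the playlist from the question.
$source = <<<'XML'
<playlist>
  <item><songtitle>I Say a Little Prayer</songtitle><genre>Soul</genre></item>
  <item><songtitle>Every Breath You Take</songtitle><genre>Pop/Rock</genre></item>
  <item><songtitle>London Calling</songtitle><genre>Post-punk</genre></item>
</playlist>
XML;

$document = new DOMDocument();
$document->loadXML($source);
$xpath = new DOMXPath($document);

$items = searchSongUnique($xpath, 'take', 'soul');
foreach ($items as $item) {
    echo $xpath->evaluate('string(songtitle)', $item), PHP_EOL;
}
```

With the terms in this order, the second song is listed before the first, which is the result the question asked for.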

Answer 1: (score: 1)

With DOM you have another option: you can register PHP functions and use them in XPath expressions.

So write a function that performs the matching logic:

function contentContains($nodes, ...$needles) {
    // ICU's transliterator (intl extension) is really convenient,
    // let's get one for lowercasing and replacing umlauts
    $transliterator = \Transliterator::create('Any-Lower; Latin-ASCII');
    foreach ($nodes as $node) {
        $haystack = $transliterator->transliterate($node->nodeValue);
        // needles are compared as-is, so pass them in lowercase ASCII
        foreach ($needles as $needle) {
            if (FALSE !== strpos($haystack, $needle)) {
                return TRUE;
            }
        }
    }
    return FALSE;
}

Now you can register it on a DOMXPath instance:

// here $xml is the XML source as a string
$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);
$xpath->registerNamespace("php", "http://php.net/xpath");
$xpath->registerPHPFunctions(['contentContains']);

$expression = "//item[
    php:function('contentContains', songtitle, 'take', 'soul') or
    php:function('contentContains', genre, 'take', 'soul')
]";

$result = [];
foreach ($xpath->evaluate($expression) as $node) {
    // read values as strings
    $result[] = [
        'title' => $xpath->evaluate('string(songtitle)', $node),
        'genre' => $xpath->evaluate('string(genre)', $node),
        // ...
    ];
}
var_dump($result);
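For trying this out without the intl extension, here is an end-to-end sketch of the same registration pattern. containsAnyLower() is a hypothetical stand-in for contentContains() that only calls strtolower(), so it does no umlaut folding; the trimmed XML and the $titles variable are likewise assumptions for illustration.

```php
<?php
// Hypothetical stand-in for contentContains(): plain strtolower(), no transliteration.
function containsAnyLower($nodes, ...$needles) {
    // A node-set argument arrives as an array of DOM nodes.
    foreach ($nodes as $node) {
        $haystack = strtolower($node->nodeValue);
        foreach ($needles as $needle) {
            if (false !== strpos($haystack, strtolower($needle))) {
                return true;
            }
        }
    }
    return false;
}

// Trimmed copy of the playlist from the question.
$source = <<<'XML'
<playlist>
  <item><songtitle>I Say a Little Prayer</songtitle><genre>Soul</genre></item>
  <item><songtitle>Every Breath You Take</songtitle><genre>Pop/Rock</genre></item>
  <item><songtitle>London Calling</songtitle><genre>Post-punk</genre></item>
</playlist>
XML;

$document = new DOMDocument();
$document->loadXML($source);
$xpath = new DOMXPath($document);
$xpath->registerNamespace('php', 'http://php.net/xpath');
$xpath->registerPHPFunctions(['containsAnyLower']);

$expression = "//item[
    php:function('containsAnyLower', songtitle, 'take', 'soul') or
    php:function('containsAnyLower', genre, 'take', 'soul')
]";

// A single expression yields the matching items in document order.
$titles = [];
foreach ($xpath->evaluate($expression) as $node) {
    $titles[] = $xpath->evaluate('string(songtitle)', $node);
}
print_r($titles);
```

Because all terms are checked in one expression, each item appears at most once, but the result order follows the document rather than the order of the search terms.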