I have a set of HTML items to parse. I need to extract the contents of every div whose class name ends with 'uid-g-uid'. Here are some sample divs:
<div class="uid-g-uid">1121</div>
<div class="yskisghuid-g-uid">14234</div>
<div class="kif893jduid-g-uid">114235</div>
I tried the following, but it did not work:
$doc = new DOMDocument();
$bdy = 'HTML Content goes here...';
@$doc->loadHTML($bdy);
$xpath = new DomXpath($doc);
$div = $xpath->query('//*[@class=ends-with(., "uid-g-uid")]');
I also tried:
$doc = new DOMDocument();
$bdy = 'HTML Content goes here...';
@$doc->loadHTML($bdy);
$xpath = new DomXpath($doc);
$div = $xpath->query('//*[@class="*uid-g-uid"]');
Please help!
Answer 0 (score: 3)
ends-with() requires XPath 2.0, so it is not available in DOMXPath, which only supports XPath 1.0. Something like this should work:
$xpath->query('//*["uid-g-uid" = substring(@class, string-length(@class) - 8)]');
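As a quick check of the expression above, here is a minimal sketch that runs it against the three sample divs from the question. The fourth div (class "uid-g-uid-other") is a hypothetical negative case added for illustration, and the variable names are assumptions.

```php
<?php
// Sample markup from the question, plus one hypothetical non-matching div.
$html = <<<HTML
<div class="uid-g-uid">1121</div>
<div class="yskisghuid-g-uid">14234</div>
<div class="kif893jduid-g-uid">114235</div>
<div class="uid-g-uid-other">should not match</div>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// "uid-g-uid" is 9 characters long and XPath string positions are 1-based,
// so the last 9 characters start at string-length(@class) - 8.
$nodes = $xpath->query('//*["uid-g-uid" = substring(@class, string-length(@class) - 8)]');

$values = [];
foreach ($nodes as $node) {
    $values[] = $node->nodeValue;
}

print_r($values);
```

Elements without a class attribute (such as the html and body wrappers that loadHTML adds) yield an empty string in the comparison, so they are not matched.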
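Answer 1 (score: 2)
You want to run an XPath 1.0 query that checks whether a string ends with a specific string. The ends-with() string function is not available in that version.
I can see several ways to do this. If, in your case, the substring only ever appears once, and then always at the end, you can simply use contains():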
//*[contains(@class, "uid-g-uid")]
If the substring can also appear at some other position and you do not want to match that, additionally check that nothing follows it:
//*[contains(@class, "uid-g-uid") and substring-after(@class, "uid-g-uid") = ""]
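If it can appear more than once, that will not work either, because substring-after() looks at what follows the first occurrence. In that case, check whether the string actually ends with it: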
//@class[substring(., string-length(.) - 8, 9) = "uid-g-uid"]/..
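This is probably the most straightforward variant. Alternatively, because the third argument of substring() is optional, you can omit it and compare up to the end of the string: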
//@class[substring(., string-length(.) - 8) = "uid-g-uid"]/..
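To make the difference between these variants concrete, the sketch below runs three of them against a few class values. The class values and labels are hypothetical, chosen only to exercise each edge case described above.

```php
<?php
// Hypothetical class values, one per edge case.
$html = <<<HTML
<div class="uid-g-uid">plain</div>
<div class="abcuid-g-uid">suffix</div>
<div class="uid-g-uid-tail">not at end</div>
<div class="uid-g-uid-x-uid-g-uid">twice</div>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// The expressions from this answer, keyed by an illustrative label.
$queries = [
    'contains'        => '//*[contains(@class, "uid-g-uid")]',
    'substring-after' => '//*[contains(@class, "uid-g-uid") and substring-after(@class, "uid-g-uid") = ""]',
    'substring'       => '//@class[substring(., string-length(.) - 8) = "uid-g-uid"]/..',
];

$results = [];
foreach ($queries as $label => $query) {
    $results[$label] = [];
    foreach ($xpath->query($query) as $node) {
        $results[$label][] = $node->nodeValue;
    }
    echo $label, ': ', implode(', ', $results[$label]), "\n";
}
```

Only the substring() variant matches both plain suffixes and the value in which the string occurs twice, while still rejecting the value where it is followed by other text.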
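Answer 2 (score: 2)
Since you are looking for an XPath function that is not available in XPath 1.0, I think you could use the DOMXPath::registerPhpFunctions feature that PHP provides, which lets you call any PHP function from your XPath query. With it, you can even call the preg_match function like this: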
$html = <<< EOF
<div class="uid-g-uid">1121</div>
<div class="yskisghuid-g-uid">14234</div>
<div class="kif893jduid-g-uid">114235</div>
EOF;
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($html); // loads your html
$xpath = new DOMXPath($doc);
// Register the php: namespace (required)
$xpath->registerNamespace("php", "http://php.net/xpath");
// Register PHP preg_match function
$xpath->registerPHPFunctions('preg_match');
// call PHP preg_match function on your xpath to make sure class ends
// with the string "uid-g-uid" using regex "/uid-g-uid$/"
$nlist = $xpath->evaluate('//div[php:functionString("preg_match",
"/uid-g-uid$/", @class) = 1]/text()');
$numnodes = $nlist->length; // no of divs matched
for($i=0; $i < $numnodes; $i++) { // run the loop on matched divs
$node = $nlist->item($i);
echo "val: " . $node->nodeValue . "\n";
}
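The same mechanism also works with a user-defined function, which avoids the regular expression. The following is only a sketch: class_ends_with is a hypothetical helper introduced here, not part of the original answer, and the fourth div is an added negative case.

```php
<?php
// Hypothetical helper: true when $value ends with $suffix.
function class_ends_with($value, $suffix)
{
    return substr($value, -strlen($suffix)) === $suffix;
}

$html = <<<HTML
<div class="uid-g-uid">1121</div>
<div class="yskisghuid-g-uid">14234</div>
<div class="kif893jduid-g-uid">114235</div>
<div class="uid-g-uid-other">should not match</div>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$xpath->registerNamespace('php', 'http://php.net/xpath');
$xpath->registerPhpFunctions('class_ends_with'); // allow only this callback

// php:functionString passes @class to the callback as a plain string;
// the boolean it returns is used directly as the predicate.
$nodes = $xpath->query('//div[php:functionString("class_ends_with", @class, "uid-g-uid")]');

$values = [];
foreach ($nodes as $node) {
    $values[] = $node->nodeValue;
}

print_r($values);
```

Passing a function name to registerPhpFunctions restricts the query to that one callback, which is usually preferable to exposing every PHP function to XPath.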
Answer 3 (score: 1)
Try this:
// First, use a regex to replace each matching class value with a findable flag.
// [^"]* keeps the match inside a single attribute value.
$bdy = preg_replace('/class="[^"]*uid-g-uid"/i', 'class="__FINDME__"', $bdy);
#/ Now find the new flag name instead
$dom = new DOMDocument();
@$dom->loadHTML($bdy);
$xpath = new DOMXPath($dom);
$divs = $xpath->evaluate("//div[@class = '__FINDME__']");
var_dump($divs->length); // debug: should be >= 1, otherwise nothing matched
for($j=0; $j<$divs->length; $j++)
{
$div = $divs->item($j);
$div_value = $div->nodeValue;
.
.
.
}