I've run into a problem. I have this HTML source:
<td class="specs_title">
Processortype
<a href="#" class="info-link">
<img src="x.jpg" title="" height="16" alt="" width="16" />
<span class="info-popup">
<span class="hd">Processortype</span>
<span class="bd">Text</span>
</span>
</a>
</td>
<td class="specs_descr">
Intel Core i3
</td>
<td class="specs_title">
Spec
<a href="#" class="info-link">
<img src="y.jpg" title="" height="16" alt="" width="16" />
<span class="info-popup">
<span class="hd">Processortype</span>
<span class="bd">Text</span>
</span>
</a>
</td>
<td class="specs_descr">
Other Spec
</td>
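I need to get "Intel Core i3" from this page using PHP and XPath. I'd like to do it with a query that finds the text "Processortype" and then works from that node to reach the value. Is this possible, and if so, how? Thanks for your replies!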
Answer 0 (score: 1)
One way is to use Symfony's DomCrawler component.
use Symfony\Component\DomCrawler\Crawler;
$html = <<<EOF
<td class="specs_title">
Processortype
<a href="#" class="info-link">
<img src="x.jpg" title="" height="16" alt="" width="16" />
<span class="info-popup">
<span class="hd">Processortype</span>
<span class="bd">Text</span>
</span>
</a>
</td>
<td class="specs_descr">
Intel Core i3
</td>
<td class="specs_title">
Spec
<a href="#" class="info-link">
<img src="y.jpg" title="" height="16" alt="" width="16" />
<span class="info-popup">
<span class="hd">Processortype</span>
<span class="bd">Text</span>
</span>
</a>
</td>
<td class="specs_descr">
Other Spec
</td>
EOF;
$crawler = new Crawler();
$crawler->addContent($html);
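// Selects every specs_descr cell; the first one is the value next to "Processortype" in this markup.
$nodes = $crawler->filterXPath("//td[@class='specs_descr']");
// Prints "Intel Core i3"; depending on the DomCrawler version, text() may keep the surrounding whitespace.
echo $nodes->first()->text();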
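Note that the query above takes the first specs_descr cell without looking at the label. To look the value up by its label, as the question asks, something like the following might work. It is only a sketch using PHP's built-in DOMDocument and DOMXPath rather than DomCrawler. The helper name findSpecValue is made up for illustration, the sample markup is shortened, and the surrounding <table><tr> wrapper is an assumption about the real page.

```php
<?php
// Hypothetical helper: returns the text of the specs_descr cell that follows
// the specs_title cell whose own leading text equals $label, or null if none.
// Assumes $label contains no single quote (it is spliced into the XPath).
function findSpecValue(string $html, string $label): ?string
{
    $doc = new DOMDocument();
    // Collect parser warnings about imperfect markup instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // text()[1] is the title cell's own first text node, so the nested
    // <span class="hd"> inside the popup is not taken into account.
    $query = sprintf(
        "//td[@class='specs_title'][normalize-space(text()[1])='%s']"
        . "/following-sibling::td[@class='specs_descr'][1]",
        $label
    );
    $cells = $xpath->query($query);
    if ($cells === false || $cells->length === 0) {
        return null;
    }
    return trim($cells->item(0)->textContent);
}

// Shortened sample; the <table><tr> wrapper is assumed.
$html = <<<'HTML'
<table><tr>
<td class="specs_title">
Processortype
<a href="#" class="info-link"><span class="info-popup"><span class="hd">Processortype</span><span class="bd">Text</span></span></a>
</td>
<td class="specs_descr">
Intel Core i3
</td>
<td class="specs_title">
Spec
<a href="#" class="info-link"><span class="info-popup"><span class="hd">Processortype</span><span class="bd">Text</span></span></a>
</td>
<td class="specs_descr">
Other Spec
</td>
</tr></table>
HTML;

echo findSpecValue($html, 'Processortype'), PHP_EOL;
```

Matching on the cell's own text node rather than on its whole string value matters here, because every title cell also carries a popup whose heading reads "Processortype".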
Answer 1 (score: 0)
$XML = '
<root>
<td class="specs_title">
Processortype
<a href="#" class="info-link">
<img src="x.jpg" title="" height="16" alt="" width="16" />
<span class="info-popup">
<span class="hd">Processortype</span>
<span class="bd">Text</span>
</span>
</a>
</td>
<td class="specs_descr">
Intel Core i3
</td>
<td class="specs_title">
Spec
<a href="#" class="info-link">
<img src="y.jpg" title="" height="16" alt="" width="16" />
<span class="info-popup">
<span class="hd">Processortype</span>
<span class="bd">Text</span>
</span>
</a>
</td>
<td class="specs_descr">
Other Spec
</td>
</root>';
$sxe = new SimpleXMLElement($XML);
var_dump(array_map('strval',$sxe->xpath("
//td[@class='specs_title' and contains(.,'Processortype')]
/following-sibling::td[@class='specs_descr'][1]")));
Output:
array(2) {
[0] =>
string(27) "
Intel Core i3
"
[1] =>
string(24) "
Other Spec
"
}
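The output has two entries because contains(., 'Processortype') tests the whole string value of each title cell, and the second cell also contains "Processortype" inside its popup span. A possible way to narrow it down is to test only the cell's direct text nodes. The sketch below uses shortened sample markup and my own variable names ($loose, $strict), so treat it as an illustration rather than a drop-in fix.

```php
<?php
// Shortened version of the sample data; both title cells still contain
// "Processortype" inside a nested span.
$xml = <<<'XML'
<root>
<td class="specs_title">
Processortype
<a href="#" class="info-link"><span class="hd">Processortype</span></a>
</td>
<td class="specs_descr">
Intel Core i3
</td>
<td class="specs_title">
Spec
<a href="#" class="info-link"><span class="hd">Processortype</span></a>
</td>
<td class="specs_descr">
Other Spec
</td>
</root>
XML;

$sxe = new SimpleXMLElement($xml);

// Original predicate: matches on the full string value, so both cells qualify.
$loose = $sxe->xpath(
    "//td[@class='specs_title' and contains(., 'Processortype')]"
    . "/following-sibling::td[@class='specs_descr'][1]"
);

// Narrower predicate: only the td's own text nodes are checked.
$strict = $sxe->xpath(
    "//td[@class='specs_title' and text()[contains(., 'Processortype')]]"
    . "/following-sibling::td[@class='specs_descr'][1]"
);

// Convert the matched elements to trimmed strings.
$values = array_map(
    static function ($node) {
        return trim((string) $node);
    },
    $strict
);

var_dump(count($loose), $values);
```

With this sample data the loose query returns two cells and the strict one returns a single cell, whose trimmed text is "Intel Core i3".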