How do I find content by its text in XPath?

Asked: 2013-10-15 18:08:16

Tags: php html text xpath find

I've run into a problem. I have this HTML source:

<td class="specs_title">
Processortype
<a href="#" class="info-link">
<img src="x.jpg" title="" height="16" alt="" width="16" />
<span class="info-popup">
<span class="hd">Processortype</span>
<span class="bd">Text</span>
</span>
</a>
</td>
<td class="specs_descr">
Intel Core i3
</td>
<td class="specs_title">
Spec
<a href="#" class="info-link">
<img src="y.jpg" title="" height="16" alt="" width="16" />
<span class="info-popup">
<span class="hd">Processortype</span>
<span class="bd">Text</span>
</span>
</a>
</td>
<td class="specs_descr">
Other Spec
</td>

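I need to get "Intel Core i3" from this page using PHP and XPath. I'd like to do it with a query that looks up the text Processortype and then works from that match. Is this possible, and if so, how? Thanks for your replies!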

2 answers:

Answer 0 (score: 1)

One approach is to use Symfony's DomCrawler component:

use Symfony\Component\DomCrawler\Crawler;

$html = <<<EOF
<td class="specs_title">
    Processortype
    <a href="#" class="info-link">
        <img src="x.jpg" title="" height="16" alt="" width="16" />
        <span class="info-popup">
            <span class="hd">Processortype</span>
            <span class="bd">Text</span>
        </span>
    </a>
</td>
<td class="specs_descr">
    Intel Core i3
</td>
<td class="specs_title">
    Spec
    <a href="#" class="info-link">
        <img src="y.jpg" title="" height="16" alt="" width="16" />
        <span class="info-popup">
            <span class="hd">Processortype</span>
            <span class="bd">Text</span>
        </span>
    </a>
</td>
<td class="specs_descr">
    Other Spec
</td>
EOF;

$crawler = new Crawler();
$crawler->addContent($html);
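// Select every specs_descr cell; first() returns the Processortype value
// only because that cell happens to come first in the markup.
$nodes = $crawler->filterXPath("//td[@class='specs_descr']");
// Prints "Intel Core i3" (older DomCrawler versions may keep the surrounding whitespace)
echo $nodes->first()->text();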
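
The query in this answer picks the first `specs_descr` cell by position rather than by label. For a lookup driven by the label text, as the question asks, here is a minimal sketch using PHP's built-in `DOMDocument` and `DOMXPath`. The `<table><tr>` wrapper and the `findSpec()` helper are assumptions added for illustration; they are not part of the original page or of any library.

```php
<?php
// Markup from the question, shortened and wrapped in <table><tr>
// (an assumption about the real page) so the cells sit in one row.
$html = <<<'HTML'
<table><tr>
<td class="specs_title">
    Processortype
    <a href="#" class="info-link">
        <span class="info-popup"><span class="hd">Processortype</span></span>
    </a>
</td>
<td class="specs_descr">
    Intel Core i3
</td>
<td class="specs_title">
    Spec
    <a href="#" class="info-link">
        <span class="info-popup"><span class="hd">Processortype</span></span>
    </a>
</td>
<td class="specs_descr">
    Other Spec
</td>
</tr></table>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

/**
 * Hypothetical helper: returns the trimmed value of the specs_descr cell
 * that follows the specs_title cell whose own text equals $label.
 * Assumes $label contains no single quote.
 */
function findSpec(DOMXPath $xpath, $label)
{
    // text() looks only at the td's direct text nodes, so the label text
    // repeated inside the nested popup <span> is ignored.
    $query = sprintf(
        "//td[@class='specs_title'][text()[normalize-space(.) = '%s']]"
        . "/following-sibling::td[@class='specs_descr'][1]",
        $label
    );
    $nodes = $xpath->query($query);

    return $nodes->length > 0 ? trim($nodes->item(0)->textContent) : null;
}

echo findSpec($xpath, 'Processortype'), "\n";
```

With the markup above, `findSpec($xpath, 'Processortype')` yields `Intel Core i3` and `findSpec($xpath, 'Spec')` yields `Other Spec`; a label that does not occur gives `null`.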

Answer 1 (score: 0)

$XML = '
<root>
    <td class="specs_title">
        Processortype
        <a href="#" class="info-link">
            <img src="x.jpg" title="" height="16" alt="" width="16" />
            <span class="info-popup">
                <span class="hd">Processortype</span>
                <span class="bd">Text</span>
            </span>
        </a>
    </td>
    <td class="specs_descr">
        Intel Core i3
    </td>
    <td class="specs_title">
        Spec
        <a href="#" class="info-link">
            <img src="y.jpg" title="" height="16" alt="" width="16" />
            <span class="info-popup">
                <span class="hd">Processortype</span>
                <span class="bd">Text</span>
            </span>
        </a>
    </td>
    <td class="specs_descr">
        Other Spec
    </td>
</root>';
$sxe = new SimpleXMLElement($XML);
var_dump(array_map('strval',$sxe->xpath("
    //td[@class='specs_title' and contains(.,'Processortype')]
    /following-sibling::td[@class='specs_descr'][1]")));

Output:

array(2) {
  [0] =>
  string(27) "
        Intel Core i3
    "
  [1] =>
  string(24) "
        Other Spec
    "
}
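
Note that the output contains both cells. `contains(., 'Processortype')` tests the whole string value of each `td`, including the nested `<span class="hd">Processortype</span>` that also appears inside the "Spec" cell. A possible refinement, sketched here on a shortened copy of the same XML, is to test only the `td`'s own text nodes. The variable names are illustrative.

```php
<?php
// Shortened copy of the XML from this answer.
$xml = <<<'XML'
<root>
    <td class="specs_title">
        Processortype
        <a href="#" class="info-link"><span class="hd">Processortype</span></a>
    </td>
    <td class="specs_descr">
        Intel Core i3
    </td>
    <td class="specs_title">
        Spec
        <a href="#" class="info-link"><span class="hd">Processortype</span></a>
    </td>
    <td class="specs_descr">
        Other Spec
    </td>
</root>
XML;

$sxe = new SimpleXMLElement($xml);

// Original predicate: matches on descendant text too, so both title cells qualify.
$loose = "//td[@class='specs_title' and contains(., 'Processortype')]"
       . "/following-sibling::td[@class='specs_descr'][1]";

// Stricter predicate: only the td's direct text nodes are compared.
$strict = "//td[@class='specs_title'][text()[normalize-space(.) = 'Processortype']]"
        . "/following-sibling::td[@class='specs_descr'][1]";

$looseCount = count($sxe->xpath($loose));

// Cast each SimpleXMLElement to string and strip the surrounding whitespace.
$values = array_map(
    function ($node) {
        return trim((string) $node);
    },
    $sxe->xpath($strict)
);

echo $looseCount, "\n";
print_r($values);
```

Here the loose query still matches two cells, while the strict one leaves a single entry, `Intel Core i3`.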