I am scraping this page. I am working with the following HTML to get the details of a section:
<h2>
<span class="mw-headline" id="Volume_one:_Quicksilver_.282003.29">Volume one:
<i>
<a href="https://en.wikipedia.org/wiki/Quicksilver_(novel)"
class="extiw"
title="w:Quicksilver (novel)">Quicksilver</a>
</i> (2003)
</span>
<span class="mw-editsection">
<span class="mw-editsection-bracket">[</span>
<a href="/w/index.php?title=The_Baroque_Cycle&action=edit&section=1"
title="Edit section: Volume one: Quicksilver (2003)">edit</a>
<span class="mw-editsection-bracket">]</span>
</span>
</h2>
I want to select the element by its id, Volume_one:_Quicksilver_.282003.29. To do that, I wrote the following code:
$sectionid = '#Volume_one:_Quicksilver_.282003.29';
print($crawler->filter( $sectionid ));
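But it does not return the element, even though it is there. What am I doing wrong? The same approach does fetch the #Epilogs section correctly.
Please help.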
Answer 0: (score: 0)
Have you tried this?
print( $crawler->filterXPath('//*[@id="Volume_one:_Quicksilver_.282003.29"]') );
I used "Inspect in FirePath" in Firefox (with Firebug installed) to get the XPath from that page.
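
The likely reason filter() fails on this id is that ":" and "." are special characters in a CSS selector (pseudo-class and class separators), so '#Volume_one:_Quicksilver_.282003.29' is not parsed as a single id. An id like #Epilogs contains no such characters, which would explain why that one works. An XPath attribute comparison treats the id as a plain string. Below is a minimal sketch using PHP's built-in DOMDocument and DOMXPath instead of the Symfony crawler, so it runs without extra packages; the $html string is a shortened stand-in for the markup in the question, not the full page.

```php
<?php
// Shortened stand-in for the heading markup from the question (assumption: not the full page).
$html = '<h2><span class="mw-headline" id="Volume_one:_Quicksilver_.282003.29">'
      . 'Volume one: <i>Quicksilver</i> (2003)</span></h2>';

// Collect parser warnings instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);

$xpath = new DOMXPath($doc);

// Double quotes inside the single-quoted PHP string keep the XPath expression valid.
// The id is compared as a plain string, so ":" and "." need no escaping.
$nodes = $xpath->query('//*[@id="Volume_one:_Quicksilver_.282003.29"]');

$text = $nodes->length > 0 ? trim($nodes->item(0)->textContent) : null;
echo $text, PHP_EOL;
```

With the Symfony crawler, the same XPath string can be passed to filterXPath(). If you prefer to stay with filter(), an attribute selector such as '[id="Volume_one:_Quicksilver_.282003.29"]' should also avoid the parsing problem, though I have not verified that against your setup.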