I'm trying to parse HTML with Simple HTML DOM.
Using this example:
<h3>
<span class="time">19:00
</span>
<a href="/simpsons">The Simpsons</a>
</h3>
<p class="synopsis">Fat Man and Little Boy: When Bart becomes a t-shirt mogul, and the household's main breadwinner, Homer worries that he no longer has a role in the family.
</p>
<a class="link" href="/simpsons/watch">Watch Now</a>
<h3>
<span class="time">20:00</span>
24
</h3>
<p class="synopsis">Emotions run high as the harrowing day climaxes with resolute President Taylor closing in on a world-changing peace treaty.
</p>
<h3>
<span class="time">21:00</span>
<a href="/lost">Lost</a>
</h3>
<p class="synopsis">Pseudo-Locke tries to destroy the island and all of its inhabitants, while Jack attempts to stop him.
</p>
<a class="link" href="/lost/watch">Watch Now</a>
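How can I grab
As you can see, the source entries are inconsistent: sometimes the title is wrapped in an anchor and sometimes it is not, and there may not always be a "Watch Now" link.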
Answer 0 (score: 0)
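This looks like valid XHTML, so it should also parse as XML, provided the fragment is wrapped in a single root element. Just traverse it as you would any other XML document.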
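
As a rough illustration of the "traverse it as XML" idea, here is a minimal sketch in Python using the standard library's `xml.etree.ElementTree`. It is not Simple HTML DOM. The names `FRAGMENT` and `parse_listings`, the `<root>` wrapper, and the abridged synopsis text are assumptions made for the example. It walks the top-level nodes in order, starts a new record at each `<h3>`, and attaches any following `<p class="synopsis">` or `<a class="link">` to that record, so a missing anchor or a missing "Watch Now" link is tolerated.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the question's markup (synopsis text shortened for the example).
FRAGMENT = """
<h3>
<span class="time">19:00
</span>
<a href="/simpsons">The Simpsons</a>
</h3>
<p class="synopsis">Fat Man and Little Boy.
</p>
<a class="link" href="/simpsons/watch">Watch Now</a>
<h3>
<span class="time">20:00</span>
24
</h3>
<p class="synopsis">Emotions run high.
</p>
"""


def parse_listings(fragment):
    """Return one dict per <h3> entry with time, title, synopsis and link."""
    # A fragment with several top-level elements is not well-formed XML,
    # so wrap it in a single (hypothetical) root element first.
    root = ET.fromstring("<root>" + fragment + "</root>")
    listings = []
    current = None
    for node in root:
        if node.tag == "h3":
            span = node.find("span[@class='time']")
            anchor = node.find("a")
            if anchor is not None:
                # Title is wrapped in an anchor.
                title = (anchor.text or "").strip()
            elif span is not None:
                # Bare title: it is the text that follows </span>.
                title = (span.tail or "").strip()
            else:
                title = ""
            time = (span.text or "").strip() if span is not None else ""
            current = {"time": time, "title": title,
                       "synopsis": None, "link": None}
            listings.append(current)
        elif current is not None and node.tag == "p" \
                and node.get("class") == "synopsis":
            current["synopsis"] = (node.text or "").strip()
        elif current is not None and node.tag == "a" \
                and node.get("class") == "link":
            current["link"] = node.get("href")
    return listings


if __name__ == "__main__":
    for entry in parse_listings(FRAGMENT):
        print(entry)
```

Entries without a "Watch Now" link simply keep `link` as `None`. This only works while the markup stays well-formed; for real-world HTML that is not valid XML, a lenient HTML parser would be the safer choice.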