PHP REGEX: Find a DOM node by its innerHTML

Time: 2010-10-21 02:34:04

Tags: php regex dom

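I'm well aware that PHP DOM can solve half of my problem. What I need is a way (not necessarily a regular expression) to find a particular DOM element given its innerHTML.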

Say I'm given this markup:

<tr>
  <td class="ranking_rank" style="vertical-align:middle;">48697</td>
  <td class="ranking_ign" style="vertical-align:middle;">kanineh</td>
  <td class="ranking_img" style="vertical-align:middle;">
    <img src="http://avatar.maplesea.com/Character/NKGEHGDLFNINKPMFLDCNNOHKHKBOHBKLGCBLABFLABHAGBPAEMDEFABJBLKJIHJAANGEKFJGELEPKMCNLKPCINEJDGAJFLKG.gif" onerror="this.src='/images/ranking/noimage.jpg'"/>
  </td>
  <td class="ranking_lvl" style="vertical-align:middle;">122</td>
  <td class="ranking_world" style="vertical-align:middle;">
    <img src="/images/ranking/Bootes.gif" onMouseover="ddrivetip('Bootes','white', 70)" onMouseout="hidetip()">
  </td>
  <td class="ranking_job" style="vertical-align:middle;">
    <img src="/images/ranking/Warrior.gif" onMouseover="ddrivetip('Warrior','white', 70)" onMouseout="hidetip()">
  </td>
  <td class="ranking_fame" style="vertical-align:middle;">449</td>
</tr>
<tr>
  <td class="ranking_rank" style="vertical-align:middle;">48698</td>
  <td class="ranking_ign" style="vertical-align:middle;">WannaLogic</td>
  <td class="ranking_img" style="vertical-align:middle;">
    <img src="http://avatar.maplesea.com/Character/DOMELFGEGCGDBFCOLADBDOJLHADCIBNKEGKGINPNBEKPDDKOEEGBLMDLBGBDHGCNPGLAECAMLGKEMDKJGPODIDKCOJCMNNKN.gif" onerror="this.src='/images/ranking/noimage.jpg'"/>
  </td>
  <td class="ranking_lvl" style="vertical-align:middle;">122</td>
  <td class="ranking_world" style="vertical-align:middle;">
    <img src="/images/ranking/Aquila.gif" onMouseover="ddrivetip('Aquila','white', 70)" onMouseout="hidetip()">
  </td>
  <td class="ranking_job" style="vertical-align:middle;">
    <img src="/images/ranking/Magician.gif" onMouseover="ddrivetip('Magician','white', 70)" onMouseout="hidetip()">
  </td>
  <td class="ranking_fame" style="vertical-align:middle;">56</td>
</tr>

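I need to be able to use the td containing WannaLogic to grab the entire row node. Once I have that table row, I can easily traverse its nodes with PHP DOM. I'm hopeless at regular expressions, so I'd really appreciate it if you could shed some light on this.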

1 answer:

Answer 0: (score: 0)

Using regular expressions on a DOM tree is a no-no and is bound to fail in the face of malformed XML/HTML. Try this:

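// $html is assumed to hold the markup shown in the question.
$doc = new DOMDocument();
@$doc->loadHTML($html); // suppress warnings from malformed markup

$xpath = new DOMXPath($doc);
$query = "//*[.='WannaLogic']"; // any element whose text is exactly 'WannaLogic'
$entries = $xpath->query($query);

foreach ($entries as $entry) {
    $row = $entry->parentNode; // the <tr> that contains the matching <td>
    // do whatever
}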
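
Since the question asks for the whole row rather than the matching cell, the XPath can also select the tr directly. The following is a minimal sketch, not a definitive solution: the $html sample is a trimmed, hypothetical copy of the rows from the question (img cells and style attributes dropped), and the $cells array keyed by class name is just one illustrative way to walk the row.

```php
<?php
// Hypothetical sample: a trimmed copy of the table rows from the question.
$html = <<<HTML
<table>
<tr>
  <td class="ranking_rank">48697</td>
  <td class="ranking_ign">kanineh</td>
  <td class="ranking_lvl">122</td>
</tr>
<tr>
  <td class="ranking_rank">48698</td>
  <td class="ranking_ign">WannaLogic</td>
  <td class="ranking_lvl">122</td>
</tr>
</table>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Select every <tr> that has a <td> child whose trimmed text is 'WannaLogic'.
$rows = $xpath->query("//tr[td[normalize-space(.)='WannaLogic']]");

// Walk the cells of the first matching row, keyed by their class attribute.
$cells = [];
$row = $rows->item(0);
if ($row !== null) {
    foreach ($xpath->query('td', $row) as $td) {
        $cells[$td->getAttribute('class')] = trim($td->textContent);
    }
}
```

normalize-space() is used here so that stray whitespace around the name does not prevent a match; the exact comparison .='WannaLogic' from the answer works as well when the cell contains nothing but the name.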