PHP: Regex to match a phrase that does not contain a specific word

Time: 2011-06-25 12:59:35

Tags: php regex

I'm tired of searching for an example that is close enough, so it's time to get some quick help! Here is my code:

preg_match_all( '#<li.*?>.*?</li>#s', $card_html, $activity );

I want to modify it so that the <li.*?> part excludes the word Unplayed. (The word appears after the .*? and before the >.)

Edit

Want to catch: http://gamercard.xbox.com/en-US/Stallion83.card

            <li >

                <a href="http://live.xbox.com/en-us/GameCenter/Achievements?title=1464993792&amp;compareTo=Stallion83">
                   <img src="http://tiles.xbox.com/tiles/vD/fP/1Gdsb2JhbA9ECgUPGgIfVl9TL2ljb24vMC84MDAwIAAAAAAAAPvgN6M=.jpg" alt="F.E.A.R. 3" title="F.E.A.R. 3" />
                   <span class="Title">F.E.A.R. 3</span>
                   <span class="LastPlayed">6/24/2011</span>
                   <span class="EarnedGamerscore">415</span>
                   <span class="AvailableGamerscore">1000</span>
                   <span class="EarnedAchievements">23</span>
                   <span class="AvailableAchievements">50</span>
                   <span class="PercentageComplete">46%</span>
                </a>
            </li>

            <li class="Complete" >

                <a href="http://live.xbox.com/en-US/GameCenter/Achievements?title=1096157212&amp;compareTo=Im%20RedJ">
                   <img src="http://tiles.xbox.com/tiles/HI/L4/1Gdsb2JhbA9ECgQJGgYfVl4gL2ljb24vMC84MDAwIAAAAAAAAPvXggM=.jpg" alt="Call of Duty: WaW" title="Call of Duty: WaW" />
                   <span class="Title">Call of Duty: WaW</span>
                   <span class="LastPlayed">6/21/2011</span>
                   <span class="EarnedGamerscore">1500</span>
                   <span class="AvailableGamerscore">1500</span>
                   <span class="EarnedAchievements">66</span>
                   <span class="AvailableAchievements">66</span>
                   <span class="PercentageComplete">100%</span>
                </a>
            </li>

Don't want: http://gamercard.xbox.com/en-US/test.card

            <li class="Unplayed"></li>

            <li class="Unplayed"></li>

Thanks.

2 Answers:

Answer 0 (score: 5)

Try this. The lookahead makes sure that the rest of the <li> tag does not contain Unplayed.

preg_match_all( '#<li(?=(.(?!Unplayed))*?>).*?>(.(?!Unplayed))*?</li>#s', $card_html, $activity ); 

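Edit: I can't tell from your example, but it sounds like Unplayed could occur either inside or outside the opening <li> tag, so the pattern above rejects it in both places.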
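If Unplayed only ever appears inside the opening tag, as in the sample markup from the question, a shorter pattern may be enough. The following is a minimal sketch under that assumption, not a replacement for the pattern above; the `$card_html` contents here are a hypothetical, trimmed-down version of the question's markup.

```php
<?php
// Hypothetical sample input, trimmed from the markup in the question.
$card_html = <<<HTML
<li >
    <a href="#"><span class="Title">F.E.A.R. 3</span></a>
</li>
<li class="Complete" >
    <a href="#"><span class="Title">Call of Duty: WaW</span></a>
</li>
<li class="Unplayed"></li>
<li class="Unplayed"></li>
HTML;

// (?![^>]*Unplayed) rejects the tag if "Unplayed" occurs before its closing ">".
// [^>]*> then consumes the rest of the opening tag, and .*?</li> the body.
// \b keeps the pattern from matching other tags that start with "li", such as <link>.
$pattern = '#<li\b(?![^>]*Unplayed)[^>]*>.*?</li>#s';

$count = preg_match_all($pattern, $card_html, $activity);

echo $count, " matching <li> elements\n";
foreach ($activity[0] as $li) {
    echo $li, "\n---\n";
}
```

Using `[^>]*` inside the lookahead keeps the check confined to the opening tag, so an Unplayed that appeared later in the body would not cause a rejection. If that case matters, the per-character lookahead from the answer above is the safer choice.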

Answer 1 (score: 1)

Regex is not well suited to this task; you can find many explanations of why here on SO. Use DOMDocument like this:

function innerHTML($node){
  $doc = new DOMDocument();
  foreach ($node->childNodes as $child)
    $doc->appendChild($doc->importNode($child, true));   
  return $doc->saveHTML();
}
$dom = new DOMDocument();
// $content is the HTML to parse ($card_html in the question)
$dom->loadHTML($content);
// To hold all your liv...
$lis = array();
// Get all li nodes
$liNodes = $dom->getElementsByTagName("li");
foreach($liNodes as $liNode) {
  // Check the class attr of each li
  $cl = $liNode->getAttribute("class");
  if ($cl != "Unplayed")
       $lis[] = innerHTML($liNode);
}
print_r($lis);

Output for the input HTML above:

Array
(
    [0] => 
                <a href="http://live.xbox.com/en-us/GameCenter/Achievements?title=1464993792&amp;compareTo=Stallion83">
                   <img src="http://tiles.xbox.com/tiles/vD/fP/1Gdsb2JhbA9ECgUPGgIfVl9TL2ljb24vMC84MDAwIAAAAAAAAPvgN6M=.jpg" alt="F.E.A.R. 3" title="F.E.A.R. 3"><span class="Title">F.E.A.R. 3</span>
                   <span class="LastPlayed">6/24/2011</span>
                   <span class="EarnedGamerscore">415</span>
                   <span class="AvailableGamerscore">1000</span>
                   <span class="EarnedAchievements">23</span>
                   <span class="AvailableAchievements">50</span>
                   <span class="PercentageComplete">46%</span>
                </a>            

    [1] => 
                <a href="http://live.xbox.com/en-US/GameCenter/Achievements?title=1096157212&amp;compareTo=Im%20RedJ">
                   <img src="http://tiles.xbox.com/tiles/HI/L4/1Gdsb2JhbA9ECgQJGgYfVl4gL2ljb24vMC84MDAwIAAAAAAAAPvXggM=.jpg" alt="Call of Duty: WaW" title="Call of Duty: WaW"><span class="Title">Call of Duty: WaW</span>
                   <span class="LastPlayed">6/21/2011</span>
                   <span class="EarnedGamerscore">1500</span>
                   <span class="AvailableGamerscore">1500</span>
                   <span class="EarnedAchievements">66</span>
                   <span class="AvailableAchievements">66</span>
                   <span class="PercentageComplete">100%</span>
                </a> 
)
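
The class check in the loop above compares the whole attribute value, so an element with class="Unplayed Foo" would still be collected. As a possible variation (a different technique from the loop above, not part of the original answer), the filter can be moved into an XPath query with DOMXPath. This is a sketch only; the `$content` value is a hypothetical, trimmed-down version of the question's markup.

```php
<?php
// Hypothetical sample input, trimmed from the markup in the question.
$content = <<<HTML
<ul>
<li ><a href="#"><span class="Title">F.E.A.R. 3</span></a></li>
<li class="Complete" ><a href="#"><span class="Title">Call of Duty: WaW</span></a></li>
<li class="Unplayed"></li>
<li class="Unplayed"></li>
</ul>
HTML;

// Collect libxml parse warnings internally instead of printing them,
// since real-world HTML is often not perfectly well-formed.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($content);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Select every <li> whose space-separated class list lacks the token "Unplayed".
// An <li> without a class attribute also passes, because @class is then empty.
$query = '//li[not(contains(concat(" ", normalize-space(@class), " "), " Unplayed "))]';

$lis = array();
foreach ($xpath->query($query) as $li) {
    // saveHTML($node) serializes the node itself, including the <li> tag.
    $lis[] = $dom->saveHTML($li);
}

print_r($lis);
```

Unlike the innerHTML() helper above, this keeps the surrounding <li> tag in each result, which is closer to what the original preg_match_all call captured.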