preg_match_all, regex and /is trouble

Time: 2013-11-27 06:18:25

Tags: php regex preg-match preg-match-all

I'm using preg_match_all in PHP to scrape an HTML page. This is what I'm trying to grab:

  <script>
    function fsb38(x) {
      var b=new 
      Array(98,100,97,98,98,98,99,50,51,55,53,50,48,100,57,98,50,100,53,100,97,48,100,52,100,57,97,56,97,51,54,99,56,38,104,52,61,53,98,99,54,102,57,55,49,99,55,101,55,61,101,48,98,55,99,57,102,110,56,57,102,98,111,78,54,102,102,109,114,53,111,54,101,102,48,48,38,54,98,61,116,50,97,99,38,56,101,51,57,49,102,61,100,101,105,106,101,63,101,101,57,48,52,112,104,112,46,115,110,111,105,115,115,105,109);
      var p=new Array(0,0,0,0,1,1,1,0,0,1,0,0,1,1,0,0,1,1,1,0,1,0,1,0,1,0,1,0,1,1,0,0,0,0,0,1,0,0,1,1,0,0,1,1,1,0,0,0,1,1,1,0,0,0,1,0,0,1,0,0,0,0,1,1,0,0,0,1,1,0,1,0,0,1,0,0,1,1,0,1,1,0,1,1,1,1,0,1,0,0,0,1,1,0,1,1,0,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1);       
      window.location = c(b,p) + x;
      return false;
    }
  </script>

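Normally preg_match_all('/var b=new(.*)var p=new/is', $output, $ar); works perfectly. However, because this block occurs several times throughout the page, I get only a single match: it runs from the point where I tell it to start all the way to the last occurrence of var p=new.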

I tried this instead: preg_match_all('/var b=new(.*)(\n)(\s)var p=new/is', $output, $ar); but when I use it, I get nothing back. What am I doing wrong?

2 Answers:

Answer 0: (score: 2)

If you want to get all of the Array() calls, use this:
preg_match_all('/var.*?=new(.*?)\)\;/is', $output, $ar);

If you only want the b=new Array() calls, use this:
preg_match_all('/var b=new(.*?)\)\;/is', $output, $ar);
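
To illustrate, here is a minimal sketch of the second pattern running against a page with two such blocks. The sample input and the function name fsb39 are made up for the demo, and the cleanup step that turns each capture into integers is an assumption about what you want to do with the result, not part of the pattern above.

```php
<?php
// Hypothetical sample input: two script blocks shaped like the one in the question.
$output = <<<'HTML'
<script>
  function fsb38(x) {
    var b=new Array(98,100,97);
    var p=new Array(0,0,1);
  }
</script>
<script>
  function fsb39(x) {
    var b=new Array(50,51,55);
    var p=new Array(1,0,0);
  }
</script>
HTML;

// The lazy (.*?) stops at the first ");" after each "var b=new",
// so every block produces its own match.
preg_match_all('/var b=new(.*?)\)\;/is', $output, $ar);

// Group 1 still starts with " Array(", so strip that prefix and split on commas.
$arrays = [];
foreach ($ar[1] as $captured) {
    $numbers = preg_replace('/^\s*Array\(/i', '', $captured);
    $arrays[] = array_map('intval', explode(',', $numbers));
}

print_r($arrays);
```

Each element of $arrays then holds the numbers of one b array, in page order.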

Answer 1: (score: 1)

Regular expressions are "greedy" by default: the .* part matches the longest possible string. You need "ungreedy" behavior, so use the U modifier.

http://php.net/manual/en/reference.pcre.pattern.modifiers.php
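
A small sketch of the difference, using the pattern from the question. The input string is a made-up, shortened stand-in for the real page, and the variable names are only for illustration.

```php
<?php
// Hypothetical sample input with two b/p pairs, like the page in the question.
$output = "var b=new Array(1,2);\n      var p=new Array(0,1);\n"
        . "var b=new Array(3,4);\n      var p=new Array(1,0);\n";

// Greedy: (.*) runs to the LAST "var p=new", so only one match comes back.
$greedyCount = preg_match_all('/var b=new(.*)var p=new/is', $output, $greedy);

// U modifier: quantifiers become ungreedy, so (.*) stops at the FIRST "var p=new".
$ungreedyCount = preg_match_all('/var b=new(.*)var p=new/isU', $output, $ungreedy);

// Same effect without U: make only this one quantifier lazy with "?".
$lazyCount = preg_match_all('/var b=new(.*?)var p=new/is', $output, $lazy);

echo "greedy: $greedyCount, ungreedy: $ungreedyCount, lazy: $lazyCount\n";
```

Note that U inverts the greediness of every quantifier in the pattern, so under U a trailing ? makes a quantifier greedy again. If only one part of the pattern should be lazy, writing .*? is the more local choice.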