Multiline RegEx

Time: 2012-06-19 10:11:46

Tags: php regex

Consider the following text:

$content=<<<EOT
    {
        "translatorID": "f4a5876a-3e53-40e2-9032-d99a30d7a6fc",
        "label": "ACL",
        "creator": "Nathan Schneider",
        "target": "^https?://(www[.])?aclweb\\.org/anthology-new/[^#]+",
        "minVersion": "1.0.7",
        "maxVersion": "",
        "priority": 100,
        "browserSupport": "gcs",
        "inRepository": true,
        "translatorType": 4,
        "lastUpdated": "2012-01-01 01:42:16"
    }

    // based on ACM translator
    function detectWeb(doc, url) {
      var namespace = doc.documentElement.namespaceURI;
        var nsResolver = namespace ? function(prefix) {
            if (prefix == 'x') return prefix; else return null;
        } : namespace;

        var bibXpath = "//a[./text() = 'bib']"
        if(doc.evaluate(bibXpath, doc, nsResolver, XPathResult.ANY_TYPE, null).iterateNext()) {
          return "multiple"
        }
      //commenting out single stuff
      // if (url.indexOf("/anthology-new/J/")>-1)
      //  return "journalArticle";
      // else
      //  return "conferencePaper";
    }
EOT;

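I want to select the text between the { } at the beginning of the text. I tested the following, but it does not produce the desired text.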

preg_match('~\{(.*)\}~m',$content,$meta);
var_dump( $meta);

What is the problem?

4 answers:

Answer 0 (score: 2)

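Even in multiline mode (the m modifier, which only changes where ^ and $ match), . does not match newlines. You can use the s (PCRE_DOTALL) modifier to make it match newlines:
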
preg_match('~\{(.*)\}~sm',$content,$meta);
                      ^

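But in your case you also need to make the match non-greedy, otherwise it will run on to the last } and also pick up the source code below the metadata block: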

preg_match('~\{(.*?)\}~sm',$content,$meta);
                  ^
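
To make the difference concrete, here is a minimal sketch comparing the three variants. The $sample string is a shortened, hypothetical stand-in for the $content in the question, not the original data.

```php
<?php
// Hypothetical, shortened stand-in for the $content from the question.
$sample = <<<'EOT'
{
    "label": "ACL",
    "priority": 100
}

function detectWeb(doc, url) {
    return "multiple";
}
EOT;

// m only: . cannot cross a newline, and no line holds both { and }, so nothing matches.
$plain = preg_match('~\{(.*)\}~m', $sample, $none);

// s, greedy: .* runs to the LAST } in the text, so the function body is included.
preg_match('~\{(.*)\}~s', $sample, $greedy);

// s, lazy: .*? stops at the FIRST }, so only the leading block is captured.
preg_match('~\{(.*?)\}~s', $sample, $lazy);

var_dump($plain);
echo $lazy[0], "\n";
```

Note that the lazy version stops at the first }, so it would cut a block short if the metadata itself contained nested braces.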


Answer 1 (score: 1)

The cheat sheet at http://www.cs.washington.edu/education/courses/cse190m/11su/cheat-sheets/php-regex-cheat-sheet.pdf says:

 Base Character Classes
 .  (Period) – Any character except newline

But it also says:

Pattern Modifiers
s   Dotall - . class includes newline
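
The two modifiers are easy to confuse, so here is a small sketch of what each one changes. The $text value is made up for illustration.

```php
<?php
// Made-up two-line input for illustration.
$text = "first line\nsecond line";

// m (PCRE_MULTILINE) only affects the anchors: ^ and $ match at every line boundary.
$withoutM = preg_match_all('~^\w+~', $text, $a);   // only "first" is at the start of the subject
$withM    = preg_match_all('~^\w+~m', $text, $b);  // "first" and "second" both start a line

// s (PCRE_DOTALL) only affects the dot: . is allowed to match a newline.
$withoutS = preg_match('~first.*second~', $text);   // . cannot cross the newline
$withS    = preg_match('~first.*second~s', $text);  // . crosses the newline

var_dump($withoutM, $withM, $withoutS, $withS);
```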

Answer 2 (score: 1)

This may be what you are after:

preg_match('/\{(.*?)\}/s', $string, $result);
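
As a possible follow-up, and assuming the leading block is valid JSON as it appears to be in the question, the full match (index 0, braces included) could be handed to json_decode(). This is only a sketch: $string here is a shortened, hypothetical sample, and the lazy pattern would truncate a block that contains nested objects.

```php
<?php
// Hypothetical, shortened sample; the real input is the $content from the question.
$string = <<<'EOT'
{
    "label": "ACL",
    "priority": 100,
    "inRepository": true
}

// based on ACM translator
function detectWeb(doc, url) { return "multiple"; }
EOT;

$meta = null;
if (preg_match('/\{(.*?)\}/s', $string, $result) === 1) {
    // $result[0] keeps the braces, $result[1] is only the text between them.
    $meta = json_decode($result[0], true);
}

var_dump($meta);
```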

Answer 3 (score: 0)

Try

// modifiers go after the closing delimiter; PCRE_MULTILINE is not a PHP constant
preg_match('~\{(.*?)\}~s',$content,$meta);
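
For reference, the fourth parameter of preg_match() takes PREG_* flags such as PREG_OFFSET_CAPTURE, which change the shape of the result rather than how the pattern matches. Pattern modifiers like m and s belong after the closing delimiter. A minimal sketch with a made-up $sample string:

```php
<?php
// Made-up input for illustration.
$sample = "x {\n  \"a\": 1\n} y";

// s goes inside the pattern string; the fourth argument is a PREG_* flag.
preg_match('~\{(.*?)\}~s', $sample, $m, PREG_OFFSET_CAPTURE);

// With PREG_OFFSET_CAPTURE each entry is [matched text, byte offset in the subject].
var_dump($m[0][1], $m[1][1]);
```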
