php - How to extract a single line from a block of HTML code - Thinbug

How to extract a single line from a block of HTML code

Time: 2014-10-08 04:48:29

Tags: php html regex

I have the following content:

<meta property="og:type" content="article" />
<meta property="og:url" content="http://website/fox/" />
<meta property="og:site_name" content="The Fox" />
<meta property="og:image" content="http://images.Fox.com/2014/09/foxandforset.gif?w=209" />
<meta property="og:title" content="Fox goes to forest" />

I need to extract a single line, namely the one with meta property="og:image", so the result should contain:

<meta property="og:image" content="http://images.Fox.com/2014/09/foxandforset.gif?w=209" />

2 Answers:

Answer 0: (score: 1)

^<meta property="og:image".*$

Try this with the m (multiline) and g (global) flags set. See the demo.

http://regex101.com/r/hQ1rP0/48
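A minimal sketch of how the pattern above might be applied in PHP, assuming the HTML is already held in a string. The variable names $html, $matches and $line are illustrative and not from the answer. PHP has no g modifier: preg_match() stops at the first match, while preg_match_all() would collect every matching line.

```php
<?php
// Sample input copied from the question.
$html = <<<'HTML'
<meta property="og:type" content="article" />
<meta property="og:url" content="http://website/fox/" />
<meta property="og:site_name" content="The Fox" />
<meta property="og:image" content="http://images.Fox.com/2014/09/foxandforset.gif?w=209" />
<meta property="og:title" content="Fox goes to forest" />
HTML;

// The m modifier makes ^ and $ match at the start and end of each line,
// so the pattern captures the whole og:image line and nothing else.
$line = null;
if (preg_match('/^<meta property="og:image".*$/m', $html, $matches)) {
    $line = $matches[0];
    echo $line, PHP_EOL;
}
```

This only works while the tag sits on a line of its own and the attributes appear in exactly this order and quoting style.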

Answer 1: (score: 1)

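Extracting "lines" of HTML, or using regular expressions to parse HTML in general, is fragile. A more robust approach is to use an HTML parser, such as the one provided by the DOM extension.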

Example

// Sample input from the question.
$html = <<<'HTML'
<meta property="og:type" content="article" />
<meta property="og:url" content="http://website/fox/" />
<meta property="og:site_name" content="The Fox" />
<meta property="og:image" content="http://images.Fox.com/2014/09/foxandforset.gif?w=209" />
<meta property="og:title" content="Fox goes to forest" />
HTML;

// Parse the markup into a DOM tree and prepare an XPath context for it.
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Select every <meta> element whose property attribute is "og:image".
$nodes = $xpath->query('//meta[@property="og:image"]');

// Serialize each matching element back to HTML.
foreach ($nodes as $node) {
    echo $dom->saveHTML($node);
}

Output:

<meta property="og:image" content="http://images.Fox.com/2014/09/foxandforset.gif?w=209">
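If only the image URL is wanted rather than the whole tag, the same DOM approach can read the content attribute directly. This is a sketch under that assumption, which goes slightly beyond what the question asked. The names $node and $url are illustrative.

```php
<?php
// A single tag from the question is enough to show the idea.
$html = '<meta property="og:image" content="http://images.Fox.com/2014/09/foxandforset.gif?w=209" />';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// item(0) returns null when no element matches, so check before use.
$node = $xpath->query('//meta[@property="og:image"]')->item(0);
$url = $node instanceof DOMElement ? $node->getAttribute('content') : null;

echo $url, PHP_EOL;
```

Unlike the regex, this does not depend on attribute order, quoting style or line breaks in the markup.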