I don't speak English well, so please excuse any mistakes.
On a website I have a div box containing game information:
<span class="noteline">Developer:</span>
<span class="subline">Gameloft</span>
<span class="noteline">Genre:</span>
<span class="subline">Racing/Arcade</span>
<span class="noteline">Release year:</span>
<span class="subline">2010</span>
I need to get the information between <span class="noteline"> and the closing </span> tag.
preg_match("/\<span\sclass=\"subline\"\>(.*)<\/span\>/imsU", $source, $matches);
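The solution above works, but it only gets the "subline" containing the text "Gameloft". I also need the sublines containing the text Racing/Arcade and 2010.
Maybe something like this (this does not work):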
for developer = preg_match("/*(\<span\sclass=\"subline\"\>){1}*(.*)*(<\/span\>){1}*/imsU", $source, $matches);
for genre = preg_match("/*(\<span\sclass=\"subline\"\>){2}*(.*)*(<\/span\>){2}*/imsU", $source, $matches);
Something like that...
Anyway, thanks for your help.
Answer 0 (score: 1)
An alternative to regular expressions is to use phpQuery or QueryPath, which reduces it to:
foreach ( qp($source)->find("span.subline") as $span ) {
print $span->text();
}
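Answer 1 (score: 1)
Regular expressions are not well suited to parsing HTML. They are hard to get right, and they tend to break on edge cases.
I don't know whether there is a simpler way, but this should work with the markup you describe: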
<?php
$fragment = '<span class="noteline">Developer:</span>
<span class="subline">Gameloft</span>
<span class="noteline">Genre:</span>
<span class="subline">Racing/Arcade</span>
<span class="noteline">Release year:</span>
<span class="subline">2010</span>';
libxml_use_internal_errors(TRUE);
$dom = new DOMDocument();
$dom->loadHTML($fragment);
$xml = simplexml_import_dom($dom);
libxml_use_internal_errors(FALSE);
foreach($xml->xpath("//span[@class='subline']") as $item){
echo (string)$item . PHP_EOL;
}
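This assumes the attribute is exactly class="subline", so it will fail when the element has multiple classes. (I'm an XPath newbie, improvements welcome.)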
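One possible way around the multiple-class limitation is the common XPath 1.0 idiom of padding the normalized class attribute with spaces and testing for ' subline ' as a whole token. The following is only a sketch under assumptions: the extra "highlight" class is invented for illustration and is not part of the original markup, and DOMXPath is used in place of SimpleXML.

```php
<?php
// Hypothetical markup: the "highlight" class is an assumption added
// only to demonstrate an element with more than one class.
$fragment = '<span class="noteline">Developer:</span>
<span class="subline highlight">Gameloft</span>
<span class="noteline">Genre:</span>
<span class="subline">Racing/Arcade</span>
<span class="noteline">Release year:</span>
<span class="subline">2010</span>';

// Collect libxml warnings internally while parsing the fragment.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($fragment);
libxml_clear_errors();
libxml_use_internal_errors(false);

$xpath = new DOMXPath($dom);

// Surround the normalized class list with spaces so that "subline"
// matches only as a complete class name, not as part of another one.
$query = "//span[contains(concat(' ', normalize-space(@class), ' '), ' subline ')]";

$values = [];
foreach ($xpath->query($query) as $node) {
    $values[] = trim($node->textContent);
}

print_r($values);
```

With this query, a span whose class is "subline highlight" is matched as well, while a class such as "sublineX" would not be.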
Answer 2 (score: 0)
Try this:
preg_match_all("/<span class=\"subline\".*span>/", $html, $matches1);
preg_match_all("/<span class=\"noteline\".*span>/", $html, $matches2);
I tried the code above like this:
<?php
$html = '<span class="noteline">Developer:</span>
<span class="subline">Gameloft</span>
<span class="noteline">Genre:</span>
<span class="subline">Racing/Arcade</span>
<span class="noteline">Release year:</span>
<span class="subline">2010</span>';
preg_match_all("/<span class=\"subline\".*span>/", $html, $matches1);
preg_match_all("/<span class=\"noteline\".*span>/", $html, $matches2);
print_r($matches1);
echo "<br>";
print_r($matches2);
?>
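The output I got (viewed in a browser, which renders the matched <span> tags instead of displaying them) is: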
Array ( [0] => Array ( [0] => Gameloft [1] => Racing/Arcade [2] => 2010 ) )
Array ( [0] => Array ( [0] => Developer: [1] => Genre: [2] => Release year: ) )
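Because these patterns have no capture group, each match is the whole <span>...</span> element. If only the inner text is wanted, one option is to add a capture group and then pair the labels with the values. This is a rough sketch, assuming every noteline is followed by exactly one subline and that the spans contain no nested tags; the $labels, $values and $info names are illustrative only.

```php
<?php
$html = '<span class="noteline">Developer:</span>
<span class="subline">Gameloft</span>
<span class="noteline">Genre:</span>
<span class="subline">Racing/Arcade</span>
<span class="noteline">Release year:</span>
<span class="subline">2010</span>';

// ([^<]*) captures only the text between the tags. Since it cannot
// cross a "<", it stops at the closing tag without an ungreedy flag.
preg_match_all('/<span\s+class="noteline">([^<]*)<\/span>/i', $html, $labels);
preg_match_all('/<span\s+class="subline">([^<]*)<\/span>/i', $html, $values);

// Index 1 of each result holds the captured texts in document order.
// Strip the trailing colon from each label and use it as the array key.
$keys = array_map(function ($label) {
    return rtrim($label, ':');
}, $labels[1]);

// Assumes both lists have the same length (one value per label).
$info = array_combine($keys, $values[1]);

print_r($info);
```

The same regex caveats from the other answers still apply: attribute order, extra classes, or nested markup would break these patterns.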