I have been stuck on this for a while. Here is my problem. I have text like this:
<ORGANIZATION>Head of Pekalongan Regency</ORGANIZATION>, Dra. Hj.. Siti Qomariyah , MA and her staff were greeted by <ORGANIZATION>Rector of IPB</ORGANIZATION> Prof. Dr. Ir. H. Herry Suhardiyanto , M.Sc. and <ORGANIZATION>officials of IPB</ORGANIZATION> in the guest room.
I tried to get the values inside the <ORGANIZATION> tags with this code:
    function get_text_between_tags($string, $tagname) {
        $pattern = "/<$tagname ?.*>(.*)<\/$tagname>/";
        preg_match($pattern, $string, $matches);
        if(!empty($matches[1]))
            return $matches[1];
    }
However, when there are 3 <ORGANIZATION> tags, this code retrieves only one value, the one from the last tag (officials of IPB).
I have no idea how to modify this code to get all the values inside the tags, without duplicates. Please help, and thanks in advance. :D
Answer 0 (score: 2)
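preg_match only returns the first match, and your current pattern has two further problems. Both .* are greedy, so when several tags sit on one line the capture group ends up holding only the content of the last tag. Also, without the s modifier the dot does not match newlines, so a tag whose content spans multiple lines is not matched at all.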
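The greedy behaviour can be seen in a minimal sketch. The input is the question's sentence shortened to its three tagged spans, and the second pattern is only an illustrative lazy variant, not the final fix:

```php
<?php
// Input from the question, shortened to the three tagged spans.
$text = '<ORGANIZATION>Head of Pekalongan Regency</ORGANIZATION>, greeted by '
      . '<ORGANIZATION>Rector of IPB</ORGANIZATION> and '
      . '<ORGANIZATION>officials of IPB</ORGANIZATION> in the guest room.';

// Pattern from the question: both ".*" are greedy, so the first one
// swallows everything up to the opening tag of the last element.
preg_match('/<ORGANIZATION ?.*>(.*)<\/ORGANIZATION>/', $text, $greedy);

// Illustrative variant: a lazy capture stops at the first closing tag.
preg_match('/<ORGANIZATION>(.*?)<\/ORGANIZATION>/', $text, $lazy);

echo $greedy[1], "\n"; // officials of IPB
echo $lazy[1], "\n";   // Head of Pekalongan Regency
```

Even the lazy variant still yields a single value, because preg_match stops after the first match.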
Instead, try this:
    function get_text_between_tags($string, $tagname) {
        $pattern = "/<$tagname\b[^>]*>(.*?)<\/$tagname>/is";
        preg_match_all($pattern, $string, $matches);
        if(!empty($matches[1]))
            return $matches[1];
        return array();
    }
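Using a regex for parsing is acceptable here, because this is a well-defined case. Note, however, that it will fail if, for some reason, there is a > inside one of the tag's attribute values.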
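If "without duplicates" in the question means that repeated values should be collapsed, one possible sketch is shown here. The helper name get_unique_text_between_tags and the sample sentence are made up for illustration. The preg_quote call is an extra precaution that only matters if the tag name could contain regex metacharacters.

```php
<?php
// Hypothetical helper (not part of the original answer): same pattern idea,
// with the tag name escaped and duplicate values removed.
function get_unique_text_between_tags(string $string, string $tagname): array {
    // Escape regex metacharacters in the tag name, including the "/" delimiter.
    $tag = preg_quote($tagname, '/');
    $pattern = "/<$tag\\b[^>]*>(.*?)<\\/$tag>/is";
    preg_match_all($pattern, $string, $matches);
    // array_unique() keeps the first occurrence of each value;
    // array_values() reindexes the result from 0.
    return array_values(array_unique($matches[1] ?? []));
}

// Made-up sample in which one value appears twice.
$text = '<ORGANIZATION>Rector of IPB</ORGANIZATION> met '
      . '<ORGANIZATION>officials of IPB</ORGANIZATION>; later the '
      . '<ORGANIZATION>Rector of IPB</ORGANIZATION> spoke.';

$result = get_unique_text_between_tags($text, 'ORGANIZATION');
print_r($result); // two entries: "Rector of IPB", "officials of IPB"
```

The comparison in array_unique is exact, so values that differ only in case or whitespace are kept as separate entries.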
If you want to avoid the wrath of the pony, try this:
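    function get_text_between_tags($string, $tagname) {
        // Unknown tags such as <ORGANIZATION> make libxml emit warnings; collect them quietly.
        $previous = libxml_use_internal_errors(true);
        $dom = new DOMDocument();
        $dom->loadHTML($string);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
        // The HTML parser lowercases element names, so look the tag up in lowercase.
        $tags = $dom->getElementsByTagName(strtolower($tagname));
        $out = array();
        foreach ($tags as $tag) {
            $out[] = $tag->nodeValue;
        }
        return $out;
    }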
Answer 1 (score: -2)
Have you tried the strip_tags() function?
    <?php
    $s = "<ORGANIZATION>Head of Pekalongan Regency</ORGANIZATION>, Dra. Hj.. Siti Qomariyah , MA and her staff were greeted by <ORGANIZATION>Rector of IPB</ORGANIZATION> Prof. Dr. Ir. H. Herry Suhardiyanto , M.Sc. and <ORGANIZATION>officials of IPB</ORGANIZATION> in the guest room.";
    $r = strip_tags($s);
    var_dump($r);
    ?>