PHP preg_match: return the string between two strings

Date: 2015-10-27 17:00:18

Tags: php html-parsing preg-match

How can I extract the link "http://example.com/view.php?id=5841" from this code:

<h3 class="coursename"><a class="" href="http://example.com/view.php?id=521">D<span class="highlight">LAW</span> <span class="highlight">130</span>Management</a></h3><div class="moreinfo"></div></div><div class="content"><ul class="teachers"><li>Teacher: <a href="http://example.com/">John</a></li></ul><div class="coursecat">Category: <a class="" href="http://example.com/">First</a></div></div></div><div class="coursebox clearfix even" data-courseid="5841" data-type="1"><div class="info"><h3 class="coursename"><a class="" href="http://example.com/view.php?id=5841"><span class="highlight">LAW</span> <span class="highlight">130

I tried:

preg_match('/href="(.*)"><span class="highlight">LAW/isU',$BBB,$AAA);

But the captured result is:

http://example.com/view.php?id=521">D<span class="highlight">LAW</span> <span class="highlight">130</span>Management</a></h3><div class="moreinfo"></div></div><div class="content"><ul class="teachers"><li>Teacher: <a href="http://example.com/">John</a></li></ul><div class="coursecat">Category: <a class="" href="http://example.com/">First</a></div></div></div><div class="coursebox clearfix even" data-courseid="5841" data-type="1"><div class="info"><h3 class="coursename"><a class="" href="http://example.com/view.php?id=5841

2 answers:

Answer 0: (score: 0)

Use this instead:

/href="(.[^<]*?)"><span class="highlight">LAW/isU

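The original pattern fails because a regex match always starts at the leftmost possible position: it begins at the first href=" and (.*) then expands, across other tags, until the first "><span class="highlight">LAW. Excluding "<" from the captured text stops the match from running across tags, so the capture can only be the href value that sits directly in front of <span class="highlight">LAW. Note that the U modifier inverts greediness, so *? is actually greedy here; the result is still correct because [^<] cannot cross a tag boundary.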
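The same idea can be sketched with a slightly different character class. This is a minimal example, not the answerer's exact pattern: it swaps .[^<]*? for [^"]* (an attribute value cannot contain its own closing quote) and drops the s and U modifiers. The variable names and the shortened markup are assumptions made for illustration.

```php
<?php
// Shortened, hypothetical version of the markup from the question.
$html = '<h3 class="coursename"><a class="" href="http://example.com/view.php?id=521">D<span class="highlight">LAW</span> <span class="highlight">130</span>Management</a></h3>'
      . '<ul class="teachers"><li>Teacher: <a href="http://example.com/">John</a></li></ul>'
      . '<h3 class="coursename"><a class="" href="http://example.com/view.php?id=5841"><span class="highlight">LAW</span> <span class="highlight">130</span></a></h3>';

// [^"]* can never run past the closing quote of the attribute,
// so the capture is confined to a single href value.
$pattern = '/href="([^"]*)"><span class="highlight">LAW/i';

// preg_match returns 1 on a match; $m[1] then holds the first capture group.
$url = preg_match($pattern, $html, $m) === 1 ? $m[1] : null;

echo $url, PHP_EOL; // prints http://example.com/view.php?id=5841
```

The first link is skipped because its closing quote is followed by ">D" rather than "><span", so only the second course link satisfies the whole pattern.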

Answer 1: (score: 0)

Use an XPath query:

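libxml_use_internal_errors(true); // suppress warnings caused by imperfect HTML
$dom = new DOMDocument;
$dom->loadHTML($yourHTML); // $yourHTML holds the markup to search

$xp = new DOMXPath($dom);
$nodes = $xp->query('//a[span[@class="highlight"]][starts-with(.,"LAW")][1]/@href');

// item(0) is null when nothing matches, so check before reading nodeValue
$link = $nodes->length > 0 ? $nodes->item(0)->nodeValue : null;

echo $link;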

Query details:

//                         # axis: anywhere in the DOM tree
a                          # node test: an "a" element
[span[@class="highlight"]] # predicate: the "a" element has a direct child "span"
                           # whose "class" attribute equals "highlight"
[starts-with(.,"LAW")]     # predicate: its text content begins with "LAW"
[1]                        # predicate: first such "a" within its parent (item(0) then takes the first result)
/@href                     # axis: its "href" attribute
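One detail of the breakdown is worth illustrating: in //a[...][1] the [1] predicate is evaluated per parent element, not over the whole document. The sketch below wraps the expression in parentheses so that [1] picks the first match in document order, and it checks the node list before reading from it. The shortened markup, the variable names, and the use of normalize-space() (to tolerate leading whitespace in the link text) are assumptions added for illustration, not part of the original answer.

```php
<?php
// Shortened, hypothetical version of the markup from the question.
$html = '<div><h3 class="coursename"><a class="" href="http://example.com/view.php?id=521">D<span class="highlight">LAW</span> <span class="highlight">130</span>Management</a></h3>'
      . '<h3 class="coursename"><a class="" href="http://example.com/view.php?id=5841"><span class="highlight">LAW</span> <span class="highlight">130</span></a></h3></div>';

libxml_use_internal_errors(true); // silence warnings about imperfect HTML
$dom = new DOMDocument;
$dom->loadHTML($html);

$xp = new DOMXPath($dom);

// The parentheses make [1] apply to the whole result set (document order)
// instead of to each parent's children separately.
$query = '(//a[span[@class="highlight"]][starts-with(normalize-space(.), "LAW")])[1]/@href';
$nodes = $xp->query($query);

// An empty DOMNodeList means nothing matched, so guard before reading the value.
$link = $nodes->length > 0 ? $nodes->item(0)->nodeValue : null;

echo $link, PHP_EOL; // prints http://example.com/view.php?id=5841
```

The first link is rejected because its text content is "DLAW 130Management", which does not start with "LAW"; the second link's text content is "LAW 130", so its href is returned.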