I am trying to write a preg_match call to match a given string, but I cannot get it right.
What I want to match is this part of a long text:
Adrian asdzxc
The pattern I tried looks like this, but it fails:
preg_match('#<div class="instant_search_title fsl fwb fcb"><a href="(?:[^"]+)">([^<]+)</a>#i', $file_contents, $matches);
print_r($matches);
The long text looks like this:
<i class="mrs friendsIcon customimg img sp_b6fmvb sx_5e511f"><u>Friend</u></i><span class="uiButtonText">Friends</span></a></div></div><div class="pls"><div class="instant_search_title fsl fwb fcb"><a href="https://www.facebook.com/adrianasdzxc" onclick="if (event.button == 0) { search_logged_ajax({"ab":"T_GS_BACKEND_ENTITY_LIMITATION","cururl":"https:\\/\\/www.facebook.com\\/adrianasdzxc","fc":1,"gc":0,"id":616226XXX,"init":"s:unknown","is_friend":true,"is_new_user":0,"locale":"en_US","o_type":1,"original_q":"adriantnt\\u0040yahoo.com","q":"adriantnt\\u0040yahoo.com","rank":0,"rc":0,"sid":"10000108681XXXX.3365508868..1","start":0,"typeahead_sid":null,"u":"https:\\/\\/www.facebook.com\\/adrianasdzxc","t":"c:name"}); }" onmouseup="if (event.button != 0) { search_logged_ajax({"ab":"T_GS_BACKEND_ENTITY_LIMITATION","cururl":"https:\\/\\/www.facebook.com\\/adrianasdzxc","fc":1,"gc":0,"id":616226XXX,"init":"s:unknown","is_friend":true,"is_new_user":0,"locale":"en_US","o_type":1,"original_q":"adriantnt\\u0040yahoo.com","q":"adriantnt\\u0040yahoo.com","rank":0,"rc":0,"sid":"10000108681XXXX.3365508868..1","start":0,"typeahead_sid":null,"u":"https:\\/\\/www.facebook.com\\/adrianasdzxc","t":"c:name"}); }" data-hovercard="/ajax/hovercard/user.php?id=616226XXX">Adrian asdzxc</a></div><div class="fsm fwn fcg"><div class="fbProfileByline searchResultPersonByline"><span class="fbProfileBylineFragment"><span class="fbProfileBylineIconContainer"><i class="mrs fbProfileBylineIcon img sp_9pvis2 sx_897cc1"></i></span><span class="fbProfileBylineLabel">Webmaster at <a href="true"><a href="https://www.facebook.com/pages/Freelancer/640564905962853" data-hovercard="/ajax/hovercard/page.php?id=640564905962853">Freelancer</a></a></span></span><span class="fbProfileBylineFragment"><span class="fbProfileBylineIconContainer"><i class="mrs fbProfileBylineIcon img sp_9pvis2 sx_ed189a"></i></span><span class="fbProfileBylineLabel">Studied Web Design at <a href="true"><a 
href="https://www.facebook.com/pages/Universitatea-Tibiscus/242668875759811" data-hovercard="/ajax/hovercard/page.php?id=242668875759811">Universitatea Tibiscus</a></a></span></span><span class="fbProfileBylineFragment"><span class="fbProfileBylineIconContainer"><i class="mrs fbProfileBylineIcon img sp_9pvis2 sx_c1d4c8"></i></span><span class="fbProfileBylineLabel">Lives in <a href="true"><a href="https://www.facebook.com/pages/Timi%C8%99oara-poland/107982459236366" data-hovercard="/ajax/hovercard/page.php?id=107982459236366">Timișoara, poland</a></a></span></span></div></div><div><div class="mts detailedsearch_actions"><a href="/browse/mutual_friends/?uid=616226XXX" rel="dialog" ajaxify="/ajax/browser/dialog/mutual_friends/?uid=616226XXX" onclick="if (event.button == 0) { search_logged_ajax({"ab":"T_GS_BACKEND_ENTITY_LIMITATION","cururl":"https:\\/\\/www.facebook.com\\/adrianasdzxc","fc":1,"gc":0,"id":616226XXX,"init":"s:unknown","is_friend":true,"is_new_user":0,"locale":"en_US","o_type":1,"original_q":"adriantnt\\u0040yahoo.com","q":"adriantnt\\u0040yahoo.com","rank":0,"rc":0,"sid":"10000108681XXXX.3365508868..1","start":0,"typeahead_sid":null,"u":"https:\\/\\/www.facebook.com\\/adrianasdzxc","t":"c:mutual_friend"}); }" onmouseup="if (event.button != 0) { search_logged_ajax({"ab":"T_GS_BACKEND_ENTITY_LIMITATION","cururl":"https:\\/\\/www.facebook.com\\/adrianasdzxc","fc":1,"gc":0,"id":616226XXX,"init":"s:unknown","is_friend":true,"is_new_user":0,"locale":"en_US","o_type":1,"original_q":"adriantnt\\u0040yahoo.com","q":"adriantnt\\u0040yahoo.com","rank":0,"rc":0,"sid":"10000108681XXXX.3365508868..1","start":0,"typeahead_sid":null,"u":"https:\\/\\/www.facebook.com\\/adrianasdzxc","t":"c:mutual_friend"}); }" role="button">2 mutual friends</a> · <a href="/messages/adrianasdzxc" ajaxify="/ajax/messaging/composer.php?ids%5B0%5D=616226XXX&ref=search" rel="dialog" onclick="if (event.button == 0) { 
search_logged_ajax({"ab":"T_GS_BACKEND_ENTITY_LIMITATION","cururl":"https:\\/\\/www.facebook.com\\/adrianasdzxc","fc":1,"gc":0,"id":616226XXX,"init":"s:unknown","is_friend":true,"is_new_user":0,"locale":"en_US","o_type":1,"original_q":"adriantnt\\u0040yahoo.com","q":"adriantnt\\u0040yahoo.com","rank":0,"rc":0,"sid":"10000108681XXXX.3365508868..1","start":0,"typeahead_sid":null,"u":"https:\\/\\/www.facebook.com\\/adrianasdzxc","t":"c:action"}); }" onmouseup="if (event.button != 0) { search_logged_ajax({"ab":"T_GS_BACKEND_ENTITY_LIMITATION","cururl":"https:\\/\\/www.facebook.com\\/adrianasdzxc","fc":1,"gc":0,"id":616226XXX,"init":"s:unknown","is_friend":true,"is_new_user":0,"locale":"en_US","o_type":1,"original_q":"adriantnt\\u0040yahoo.com","q":"adriantnt\\u0040yahoo.com","rank":0,"rc":0,"sid":"10000108681XXXX.3365508868..1","start":0,"typeahead_sid":null,"u":"https:\\/\\/www.facebook.com\\/adrianasdzxc","t":"c:action"}); }" role="button">Send message</a></div></div></div></div></div></div></div> --></code>
Answer 0 (score: 5)
You can use the DOM:
$dom = new DOMDocument();
@$dom->loadHTML($html); //or loadHTMLFile("filename")
$xpath = new DOMXPath($dom);
$textNode = $xpath->query('//div[contains(@class,"instant_search_title")]/a/text()');
$result = $textNode->item(0)->textContent;
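For reference, here is a self-contained sketch of the DOM approach. The `$html` string is a shortened, hypothetical stand-in for the markup in the question, and `libxml_use_internal_errors()` is used in place of the `@` operator so that parse warnings are collected rather than suppressed. The length check is an addition that avoids calling `textContent` on `null` when nothing matches.

```php
<?php
// Shortened, hypothetical stand-in for the markup shown in the question.
$html = '<div class="pls"><div class="instant_search_title fsl fwb fcb">'
      . '<a href="https://www.facebook.com/adrianasdzxc" onclick="return false;" '
      . 'data-hovercard="/ajax/hovercard/user.php?id=1">Adrian asdzxc</a></div></div>';

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$nodes = $xpath->query('//div[contains(@class,"instant_search_title")]/a/text()');

// Guard against an empty result before reading the first node.
$name = ($nodes !== false && $nodes->length > 0)
    ? trim($nodes->item(0)->textContent)
    : null;

echo $name ?? 'not found', PHP_EOL;
```

Because `contains(@class, ...)` is a substring test, it would also match a class such as `instant_search_title_extra`; that is probably acceptable here but worth keeping in mind.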
The regex way:
if (preg_match('~<div class="instant_search_title\b[^<]++<a\b[^>]*+>\K[^<]++~',
$html, $match))
echo $match[0];
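The pattern in the question most likely fails because it requires `>` immediately after the closing quote of `href`, while the real anchor carries further attributes (`onclick`, `onmouseup`, `data-hovercard`). The pattern above skips them with `[^>]*+` and uses `\K` to drop everything matched so far from the result. A minimal comparison, assuming a shortened, hypothetical `$html` sample:

```php
<?php
// Shortened, hypothetical stand-in for the markup shown in the question.
$html = '<div class="instant_search_title fsl fwb fcb">'
      . '<a href="https://www.facebook.com/adrianasdzxc" onclick="return false;" '
      . 'data-hovercard="/ajax/hovercard/user.php?id=1">Adrian asdzxc</a></div>';

// Pattern from the question: expects ">" right after the href value.
$strict = '#<div class="instant_search_title fsl fwb fcb"><a href="(?:[^"]+)">([^<]+)</a>#i';

// Pattern from this answer: tolerates any attributes before ">".
$loose = '~<div class="instant_search_title\b[^<]++<a\b[^>]*+>\K[^<]++~';

$strictHit = preg_match($strict, $html, $strictMatch);
$looseHit  = preg_match($loose, $html, $looseMatch);

var_dump($strictHit);                 // 0: the extra attributes block the match
echo $looseMatch[0] ?? '', PHP_EOL;   // the link text, thanks to \K
```

Note that `[^>]*+` would stop early if an attribute value itself contained a `>` character; the sample in the question does not appear to, but the DOM approach avoids that risk entirely.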
Answer 1 (score: 1)
I would not recommend using regex on HTML either, but here is a regex that works for a typical document:
<div class="instant_search_title fsl fwb fcb"><a href=".+?">(.*?)<
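The pattern above is shown without delimiters, so it cannot be passed to `preg_match` as is. A minimal sketch of how it might be used, assuming `~` as the delimiter and a shortened, hypothetical `$html` sample:

```php
<?php
// Shortened, hypothetical stand-in for the markup shown in the question.
$html = '<div class="instant_search_title fsl fwb fcb">'
      . '<a href="https://www.facebook.com/adrianasdzxc" onclick="return false;" '
      . 'data-hovercard="/ajax/hovercard/user.php?id=1">Adrian asdzxc</a></div>';

// The same pattern, wrapped in "~" delimiters (neither "~" nor an unescaped
// delimiter appears inside it, so no extra escaping is needed).
$pattern = '~<div class="instant_search_title fsl fwb fcb"><a href=".+?">(.*?)<~';

$found = preg_match($pattern, $html, $matches);

// Group 1 holds the link text when the pattern matches.
$name = $found === 1 ? $matches[1] : null;
echo $name ?? 'not found', PHP_EOL;
```

The lazy `.+?` runs from the `href` value to the first `">` sequence, which is how it gets past the other attributes. Since `.` does not match newlines by default, the `s` modifier would be needed if the anchor tag were split across lines.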