<h3 style="border-bottom: 3px solid #CCC;" class="margint15 marginb15">Headlines</h3>
<table cellpadding="0" cellspacing="0" border="0" class="nc" width="100%">
<tr>
<th class="left" colspan="2">Latest Headlines</th>
</tr>
<tr>
<td class="left" width="620"> <a href="/blogs/rhb/79680.jsp" style="color:#06a;">Trading Stocks - 10
July 2015 - Globetronics | A&M | Salcon | Comintel | Homeritz |
MMSV</a> </td>
</tr>
</table>
I want to extract the data from the <table> tag whose class="nc", up to its closing </table> tag. How do I write a pattern for preg_match?
Answer 0 (score: 1)
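Actually, this has been discussed here a thousand times: better not use some regular expression to grab html tags (although there may be cases where it works just fine). In the Christmas spirit, here is an example for your purpose (grabbing financial data from a site that does not belong to you ;-)). Consider using an XML parser instead: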
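<?php
// Note: a bare "&" is not well-formed XML, so "A&M" is written as "A&amp;M" here.
// Real-world HTML usually needs this kind of clean-up before an XML parser accepts it.
$str='<container>
<h3 style="border-bottom: 3px solid #CCC;" class="margint15
marginb15">Headlines</h3> <table cellpadding="0" cellspacing="0"
border="0" class="nc" width="100%"> <tr><th class="left"
colspan="2">Latest Headlines</th></tr> <tr><td class="left" width="620"> <a
href="/blogs/rhb/79680.jsp" style="color:#06a;">Trading Stocks - 10
July 2015 - Globetronics | A&amp;M | Salcon | Comintel | Homeritz |
MMSV</a> </td></tr></table>
</container>';
$xml = simplexml_load_string($str);
if ($xml === false) {
    // simplexml_load_string returns false when the input is not well-formed XML
    die('Could not parse the markup as XML');
}
print_r($xml);
// now you can loop over the table rows with
foreach ($xml->table->tr as $row) {
    // do whatever you want with it
    // child elements can be accessed likewise, e.g. $row->td->a
}
?>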
Hint: obviously, I made up the container tag; in your case it will probably be html.
Addendum: as Scuzzy pointed out, get familiar with xpath (here's a good starting point); the combination is very powerful.
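To make the xpath suggestion a bit more concrete, here is a minimal sketch, not a definitive solution. It assumes PHP's bundled DOM extension is available and uses DOMDocument::loadHTML rather than SimpleXML, because the HTML parser is more forgiving about things like the bare "&" in "A&M". The variable names ($html, $headlines) are only for illustration.

```php
<?php
// Illustrative input: the markup from the question, slightly shortened.
$html = '<h3 class="margint15 marginb15">Headlines</h3>
<table cellpadding="0" cellspacing="0" border="0" class="nc" width="100%">
<tr><th class="left" colspan="2">Latest Headlines</th></tr>
<tr><td class="left" width="620"> <a href="/blogs/rhb/79680.jsp" style="color:#06a;">Trading Stocks - 10 July 2015 - Globetronics | A&M | Salcon | Comintel | Homeritz | MMSV</a> </td></tr>
</table>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Select every link inside a table whose class attribute is exactly "nc".
$links = $xpath->query('//table[@class="nc"]//a');

$headlines = [];
foreach ($links as $a) {
    $headlines[] = [
        'href' => $a->getAttribute('href'),
        // Collapse the whitespace runs left over from the markup.
        'text' => trim(preg_replace('/\s+/', ' ', $a->textContent)),
    ];
}
print_r($headlines);
```

The expression @class="nc" only matches when the attribute value is exactly nc; if the table carried several classes, something like contains(concat(" ", normalize-space(@class), " "), " nc ") would be needed instead.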
Answer 1 (score: 0)
You should do it like this:
$str = '<h3 style="border-bottom: 3px solid #CCC;" class="margint15 marginb15">Headlines</h3><table cellpadding="0" cellspacing="0" border="0" class="nc" width="100%"> <tr><th class="left" colspan="2">Latest Headlines</th></tr> <tr><td class="left" width="620"> <a href="/blogs/rhb/79680.jsp" style="color:#06a;">Trading Stocks - 10 July 2015 - Globetronics | A&M | Salcon | Comintel | Homeritz | MMSV</a> </td></tr></table>';
preg_match_all('/<table.*?>(.*?)<\/table>/si', $str, $matches);
echo "<pre>";
print_r( strip_tags($matches[1][0]) );
die();
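Note that the pattern above matches the first table of any class. If only the table with class="nc" is wanted, a variation along these lines might work. This is a rough sketch under a few assumptions: the class attribute is written exactly as class="nc", tables are not nested, and the extra table with class="other" is made up purely to show that it gets skipped.

```php
<?php
// Shortened sample; the first table (class="other") is hypothetical.
$str = '<h3 class="margint15 marginb15">Headlines</h3><table class="other"><tr><td>skip me</td></tr></table><table cellpadding="0" cellspacing="0" border="0" class="nc" width="100%"> <tr><th class="left" colspan="2">Latest Headlines</th></tr> <tr><td class="left" width="620"> <a href="/blogs/rhb/79680.jsp" style="color:#06a;">Trading Stocks - 10 July 2015</a> </td></tr></table>';

// [^>]* keeps the class check inside the opening <table ...> tag;
// (.*?) then lazily captures everything up to the first </table>.
$pattern = '/<table\b[^>]*\bclass="nc"[^>]*>(.*?)<\/table>/si';

if (preg_match($pattern, $str, $m)) {
    // Drop the inner tags and collapse leftover whitespace.
    $text = trim(preg_replace('/\s+/', ' ', strip_tags($m[1])));
    echo $text, PHP_EOL;
} else {
    echo 'No table with class="nc" found', PHP_EOL;
}
```

A pattern like this is brittle by nature: single-quoted attributes, additional class names, or a nested table would all defeat it, which is the usual argument for a parser over a regular expression.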
Thanks!