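I am trying to get the href from the string below, but I can't, because the link has spaces in it. I tried to do it with a regular expression, but I am no regex expert. I tried examples from the internet, but none of them returned the value I expected.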
<table class="grid border" cellspacing="0" border="0" id="ctl00_ContentBody_grvStudentResult" style="width:100%;border-collapse:collapse;">
<tbody>
<tr>
<th align="left" valign="middle" scope="col">Code</th>
<th align="left" valign="middle" scope="col">Subject</th>
<th align="left" valign="middle" scope="col">Status</th>
<th align="center" valign="middle" scope="col">Score</th>
<th align="center" valign="middle" scope="col">Result Date</th>
</tr>
<tr class="detail1">
<td align="left" valign="middle">
DipPM15PQ
</td>
<td align="left" valign="middle">
<span class="">
1561|
<a onclick="return hs.htmlExpand( this, { objectType: 'iframe', width: 800, height: 600, outlineWhileAnimating: true, preserveContent: false } )" href="DetailResults.aspx?sid=90651&id=1769095&nsub= [Project Quality] &Subjectid=1561" title="Approved ">
<img alt="" style="display: online" src="../Images/Common/r_Approved.gif" border="0">
[Project Quality] </a>
</span>
<span class="selected">
</span>
</td>
<td align="left" valign="middle">
<span class="enable">
Competent
</span>
<center style="display: none">
<span disabled="disabled"><input id="ctl00_ContentBody_grvStudentResult_ctl02_chkAP" type="checkbox" name="ctl00$ContentBody$grvStudentResult$ctl02$chkAP" checked="checked" disabled="disabled"><label for="ctl00_ContentBody_grvStudentResult_ctl02_chkAP"> </label></span>
</center>
</td>
<td align="center" valign="middle">
75.00
</td>
<td align="center" valign="middle">
11/11/2018
</td>
</tr>
<tr class="detail1">
<td align="left" valign="middle">
DipPM15PC
</td>
<td align="left" valign="middle">
<span class="">
1559|
<a onclick="return hs.htmlExpand( this, { objectType: 'iframe', width: 800, height: 600, outlineWhileAnimating: true, preserveContent: false } )" href="DetailResults.aspx?sid=90898&id=1769088&nsub= [Project Costs] &Subjectid=1559" title="NAN ">
<img alt="" style="display: online" src="../Images/Common/r_.gif" border="0">
[Project Costs] </a>
</span>
<span class="selected">
[progress]
</span>
</td>
<td align="left" valign="middle">
<span class="disable">
</span>
<center style="display: none">
</center>
</td>
<td align="center" valign="middle">
</td>
<td align="center" valign="middle">
</td>
</tr>
</tbody>
Answer 0 (score: 1)
A better way to parse HTML is to use DOMDocument. You can use it to process the HTML and collect the href of every <a> tag in it. I'm assuming your HTML is in a variable called $html:
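// Collect libxml parse warnings internally instead of printing them;
// real-world HTML such as the unescaped & in these URLs triggers them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();
$urls = [];
foreach ($doc->getElementsByTagName('a') as $a) {
    $urls[] = $a->getAttribute('href');
}
foreach ($urls as $url) {
    echo $url . "\n";
}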
Output:
DetailResults.aspx?sid=90651&id=1769095&nsub= [Project Quality] &Subjectid=1561
DetailResults.aspx?sid=90898&id=1769088&nsub= [Project Costs] &Subjectid=1559
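If the page contains other links as well, the same DOMDocument approach can be narrowed down with DOMXPath. The following is only a sketch: it assumes the table id from the question (ctl00_ContentBody_grvStudentResult) is stable, and the $html value here is a trimmed-down, hypothetical stand-in for the real page.

```php
<?php
// Hypothetical sample input: a shortened version of the table from the question,
// plus one unrelated link that should be ignored.
$html = <<<'EOT'
<table id="ctl00_ContentBody_grvStudentResult">
<tr><td><a href="DetailResults.aspx?sid=90651&id=1769095&nsub= [Project Quality] &Subjectid=1561" title="Approved ">[Project Quality] </a></td></tr>
<tr><td><a href="DetailResults.aspx?sid=90898&id=1769088&nsub= [Project Costs] &Subjectid=1559" title="NAN ">[Project Costs] </a></td></tr>
</table>
<a href="other.aspx">outside the table</a>
EOT;

// Keep libxml warnings (e.g. about the unescaped & characters) out of the output.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Select only <a> elements that have an href and sit inside the results table.
$xpath = new DOMXPath($doc);
$query = '//table[@id="ctl00_ContentBody_grvStudentResult"]//a[@href]';

$hrefs = [];
foreach ($xpath->query($query) as $a) {
    $hrefs[] = $a->getAttribute('href');
}

print_r($hrefs);
```

Because the attribute value is read from the parsed DOM, the spaces inside the href do not need any special handling.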
If you have to use a regular expression, this one will work for your sample data:
preg_match_all('/href="([^"]+)/', $html, $matches);
print_r($matches[1]);
Output:
Array (
[0] => DetailResults.aspx?sid=90651&id=1769095&nsub= [Project Quality] &Subjectid=1561
[1] => DetailResults.aspx?sid=90898&id=1769088&nsub= [Project Costs] &Subjectid=1559
)
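The pattern above only matches href values wrapped in double quotes. As a hedged sketch, a slightly more tolerant variant can capture the opening quote and require the same quote to close the value. The $html string below is a hypothetical sample that mixes both quote styles; it is not taken from the real page.

```php
<?php
// Hypothetical sample input: one link with double quotes, one with single quotes.
$html = '<a href="DetailResults.aspx?sid=90651&nsub= [Project Quality] &Subjectid=1561">A</a>'
      . "<a href='DetailResults.aspx?sid=90898&nsub= [Project Costs] &Subjectid=1559'>B</a>";

// (["\']) captures the opening quote; \1 requires the same character to end the value.
// (.*?) is lazy, so it stops at the first matching closing quote, spaces included.
preg_match_all('/\bhref\s*=\s*(["\'])(.*?)\1/i', $html, $matches);

// Group 2 holds the href values; trim() removes leading or trailing whitespace, if any.
$hrefs = array_map('trim', $matches[2]);

print_r($hrefs);
```

This is still a regex over HTML, so it shares the usual limits of that approach (for example, it would also match href= text inside a script or comment).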
Answer 1 (score: 0)
I'm no expert, but this works for me:
$string ='<a onclick="return hs.htmlExpand( this, { objectType: \'iframe\', width: 800, height: 600, outlineWhileAnimating: true, preserveContent: false } )" href="DetailResults.aspx?sid=90651&id=1769095&nsub= [Project Quality] &Subjectid=1561" title="Approved ">
<img alt="" style="display: online" src="../Images/Common/r_Approved.gif" border="0">
[Project Quality] </a>';
preg_match_all('~<a .*?href=[\'"](.*?)[\'"].*?>~', $string, $match);
// $match[0] holds the full <a ...> tags; $match[1] holds the captured href values.
$urls = $match[1]; // array of links
print_r($urls);