I have this:
$str= '</b><b>Tech Fax:<br/>
</b><b>Tech Fax Ext:<br/>
</b><b>Tech Email: </b><a href="mailto:rsurikov@gmail.com">rsurikov@gmail.com</a><br/>
<b>Name Server: </b><a href="/index.php?query=69.93.127.10&output=nice">ns1.linode.com</a><br/>
<b>Name Server: </b><a href="/index.php?query=65.19.178.10&output=nice">ns2.linode.com</a><br/>
<b>Name Server: </b><a href="/index.php?query=75.127.96.10&output=nice">ns3.linode.com</a><br/>
<b>Name Server: </b><a href="/index.php?query=207.192.70.10&output=nice">ns4.linode.com</a><br/>
<b>Name Server: </b><a href="/index.php?query=109.74.194.10&output=nice">ns5.linode.com</a><br/>
<b>DNSSEC:</b>Unsigned<br/>
<b>Registrar Abuse Contact Email: </b><a href="mailto:abuse-contact@publicdomainregistry.com">abuse-contact@publicdomainregistry.com</a><br/>
<b>Registrar Abuse Contact Phone: </b>+1-2013775952<br/>
<b>URL of the ICANN WHOIS Data Problem Reporting System:<br/>
</b><a href="http://wdprs.internic.net" target="_blank">http://wdprs.internic.net</a>/<br/>
>>>Last update of WHOIS database: 2015-07-01T16:22:28+0000Z<br />
</td><td bgcolor="#C0C0C0" width="53" rowspan="2">
</td></tr>
<tr align="left" valign="top"><td bgcolor="#C0C0C0" width="639">
</td></tr>
</table><br />
<form name="queryform" method="post" action="/index.php">
<table cellpadding="6" cellspacing="0" border="0" width="540" dir="ltr">
<tr><td bgcolor="#C0C0C0">
<table width="100%" cellpadding="0" cellspacing="0" border="0" dir="ltr">
<tr class="upperrow">
<td align="left" valign="top" nowrap="nowrap"><font face="Arial" size="+0"><b>Enter any domain name:</b></font></td>
</tr>
<tr class="middlerow">
<td align="center" valign="middle" nowrap="nowrap">
<input type="text" name="query" value="" class="queryinput" size="20" /> <input type="submit" name="submit" value="Check Domain" /></td>
</tr>
<tr class="lowerrow">
<td align="right" valign="bottom"></td>
</tr>
</table>';
I need a regular expression in PHP that finds the "Name Server:" lines and saves each whole line for me. I need $match to be:
<b>Name Server: </b><a href="/index.php?query=69.93.127.10&output=nice">ns1.linode.com</a><br/>
<b>Name Server: </b><a href="/index.php?query=65.19.178.10&output=nice">ns2.linode.com</a><br/>
<b>Name Server: </b><a href="/index.php?query=75.127.96.10&output=nice">ns3.linode.com</a><br/>
<b>Name Server: </b><a href="/index.php?query=207.192.70.10&output=nice">ns4.linode.com</a><br/>
<b>Name Server: </b><a href="/index.php?query=109.74.194.10&output=nice">ns5.linode.com</a><br/>
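Also, $str does not always contain the same number of "Name Server:" lines; sometimes there are two, sometimes five. That is the problem with the regular expression I wrote, which is: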
/Name Server[^:]*:\s*(.*)\s(.*)/i
Answer 0 (score: 0)
You can use DOMDocument together with DOMXPath:
$dom = new DOMDocument;
@$dom->loadHTML($str);
$xp = new DOMXPath($dom);
$links = $xp->query('//b[text()="Name Server: "]/following-sibling::a[1]');
foreach ($links as $link) {
echo $link->nodeValue . PHP_EOL;
}
The XPath query means:
// # anywhere in the DOM tree
b # a b tag
[text()="Name Server: "] # condition: the text content must be "Name Server: "
/following-sibling::a[1] # the first following "a" tag
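As a minimal sketch of the same idea, the loop can collect the host names into an array instead of echoing them, and libxml's internal error handling can stand in for the @ operator. The shortened $str and the $nameServers variable are assumptions made for illustration, not part of the original answer.

```php
<?php
// Shortened sample of the $str from the question (illustrative assumption).
$str = <<<'HTML'
<b>Tech Email: </b><a href="mailto:rsurikov@gmail.com">rsurikov@gmail.com</a><br/>
<b>Name Server: </b><a href="/index.php?query=69.93.127.10&output=nice">ns1.linode.com</a><br/>
<b>Name Server: </b><a href="/index.php?query=65.19.178.10&output=nice">ns2.linode.com</a><br/>
<b>Registrar Abuse Contact Phone: </b>+1-2013775952<br/>
HTML;

// Collect parser warnings internally instead of silencing them with @.
$previous = libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML($str);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xp = new DOMXPath($dom);
$nameServers = []; // hypothetical variable name
foreach ($xp->query('//b[text()="Name Server: "]/following-sibling::a[1]') as $link) {
    // nodeValue is the link text, i.e. the host name, not the whole HTML line.
    $nameServers[] = $link->nodeValue;
}

print_r($nameServers);
```

Note that this yields the host names only. It works for any number of "Name Server:" entries, because the XPath query returns one node per matching b tag.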
Answer 1 (score: -1)
You have to use the preg_match_all function. Take the following short script as an example; it prints the three matches abc, aaa and aba:
<?php
$a = "abc\ndef\naaa\naba\nxyz";
$matches = array();
preg_match_all("/a.*/", $a, $matches);
print_r($matches);
?>
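Applied to the question, the same function could capture every "Name Server:" line in full. This is a sketch under the assumption that each entry sits on its own line, as in the posted $str; the shortened sample string and the $lines variable are illustrative.

```php
<?php
// Shortened sample of the $str from the question (illustrative assumption).
$str = <<<'HTML'
<b>Tech Email: </b><a href="mailto:rsurikov@gmail.com">rsurikov@gmail.com</a><br/>
<b>Name Server: </b><a href="/index.php?query=69.93.127.10&output=nice">ns1.linode.com</a><br/>
<b>Name Server: </b><a href="/index.php?query=65.19.178.10&output=nice">ns2.linode.com</a><br/>
<b>DNSSEC:</b>Unsigned<br/>
HTML;

// m: ^ and $ match at every line boundary; i: case-insensitive.
// Without the s flag, .* stops at the newline, so each match is one line.
preg_match_all('/^<b>Name Server:.*$/mi', $str, $matches);

$lines = $matches[0]; // hypothetical variable name: one entry per matching line
print_r($lines);
```

Because preg_match_all returns every match, the number of "Name Server:" lines in the input does not matter.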
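Answer 2 (score: -1)
Generally speaking, trying to search or parse HTML with regular expressions is a bad idea. However, if you insist, and you are sure the HTML will not differ much from what you posted above, you can do this: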
/^(?:<b>Name Server: <\/b><a href="\/index.php\?query=\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\&output=nice">\w+\.\w+\.\w+<\/a><br\/>.)+^/sm
You can see how it works here: https://regex101.com/r/dU6gH4/1
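A pattern of this shape matches the whole run of consecutive "Name Server:" lines as one block, so the block still has to be split into lines. The sketch below uses the pattern from this answer with every literal dot in the IP address escaped, and assumes LF line endings and at least one further line after the last name server; the sample string and the $lines variable are illustrative.

```php
<?php
// Shortened sample of the $str from the question (illustrative assumption).
$str = <<<'HTML'
<b>Tech Email: </b><a href="mailto:rsurikov@gmail.com">rsurikov@gmail.com</a><br/>
<b>Name Server: </b><a href="/index.php?query=69.93.127.10&output=nice">ns1.linode.com</a><br/>
<b>Name Server: </b><a href="/index.php?query=65.19.178.10&output=nice">ns2.linode.com</a><br/>
<b>DNSSEC:</b>Unsigned<br/>
HTML;

// s: the trailing "." may consume the newline after each <br/>;
// m: ^ matches at the start of every line.
$pattern = '/^(?:<b>Name Server: <\/b><a href="\/index.php\?query=\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\&output=nice">\w+\.\w+\.\w+<\/a><br\/>.)+^/sm';

$lines = []; // hypothetical variable name
if (preg_match($pattern, $str, $m)) {
    // $m[0] holds all consecutive name server lines, newline-separated.
    $lines = explode("\n", trim($m[0]));
}

print_r($lines);
```

The strictness is the trade-off here: any change in the markup, such as a different host name depth or a CRLF line ending, makes the pattern stop matching.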