$string = '<td class="t_ip">85.185.244.101</td><td class="t_port"> <script type="text/javascript"> //<![CDATA[ document.write(HttpSocks^Xinemara^47225); //]]> </script> </td><td class="t_type"> 4 </td>';
$regex = "/<td class=\"t_ip\">\\s*((?:[0-9]{1,3}\\.){3}[0-9]{1,3})(?:.|\\n)*<td class=\"t_port\">(?:.|\\n)*\^([0-9]{1,5})(?:.|\\n)*<td class=\"t_type\">\\s*([0-9])/";
preg_match($regex, $string, $matches);
$newString = $matches[1] . ':' . $matches[2] . ' ' . $matches[3];
print_r($newString);
Regular expression:
$regex = "/<td class=\"t_ip\">\\s*((?:[0-9]{1,3}\\.){3}[0-9]{1,3})(?:.|\\n)*<td class=\"t_port\">(?:.|\\n)*\^([0-9]{1,5})(?:.|\\n)*<td class=\"t_type\">\\s*([0-9])/";
It extracts the information in this form:
85.185.244.101:22088 4
But it does not work when the markup is repeated two or more times:
$string = '<td class="t_ip">85.185.244.101</td><td class="t_port"><script type="text/javascript"> //<![CDATA[document.write(HttpSocks^Xinemara^47225);//]]></script></td><td class="t_type">4</td><td class="t_ip">85.185.244.101</td><td class="t_port"><script type="text/javascript"> //<![CDATA[document.write(HttpSocks^Xinemara^47225);//]]></script></td><td class="t_type">4</td><td class="t_ip">85.185.244.101</td><td class="t_port"><script type="text/javascript"> //<![CDATA[document.write(HttpSocks^Xinemara^47225);//]]></script></td><td class="t_type">4</td>';
What has to change to make it work?
Answer 0 (score: 1)
I would use a parser instead of a regex; regexes and HTML do not go well together. You could do it like this:
<?php
$string = '<td class="t_ip">85.185.244.101</td><td class="t_port"> <script type="text/javascript"> //<![CDATA[ document.write(HttpSocks^Xinemara^47225); //]]> </script> </td><td class="t_type"> 4 </td>';
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($string);
libxml_use_internal_errors(false);
$cells = $doc->getElementsByTagName('td');
foreach($cells as $cell) {
if(preg_match('/\bt_(ip|type)\b/', $cell->getAttribute('class'), $type)){
echo $type[1] . "=" . trim($cell->nodeValue) . "\n";
}
}
Output:
ip=85.185.244.101
type=4
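The loop above prints the IP and type but skips the t_port cell, and it does not group cells into rows. A minimal sketch of how the same parser approach might handle repeated rows follows. The function name extractProxies is hypothetical, and it assumes that the port is the last ^-prefixed number inside document.write(...) and that a t_type cell closes each row; neither assumption is confirmed by the question.

```php
<?php
// Hypothetical helper (not from the original answer): groups consecutive
// t_ip / t_port / t_type cells into one array per row.
function extractProxies(string $html): array
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // suppress warnings about the HTML fragment
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $rows = [];
    $current = [];
    foreach ($doc->getElementsByTagName('td') as $cell) {
        if (!preg_match('/\bt_(ip|port|type)\b/', $cell->getAttribute('class'), $m)) {
            continue; // not one of the cells we care about
        }
        $value = trim($cell->textContent);
        if ($m[1] === 'port') {
            // Assumption: the port is the number right before the closing ")"
            // of document.write(...), e.g. "^47225)".
            $value = preg_match('/\^(\d{1,5})\s*\)/', $value, $p) ? $p[1] : null;
        }
        $current[$m[1]] = $value;
        if ($m[1] === 'type') { // assumption: t_type is the last cell of a row
            $rows[] = $current;
            $current = [];
        }
    }
    return $rows;
}
```

Calling extractProxies($string) on the three-row string from the question should return one array per row with the keys ip, port and type, which can then be joined as ip:port type.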
If you need to validate the IP, you could add the following inside the foreach loop:
if($type[1] == 'ip') {
// Trim first: surrounding whitespace would make filter_var() reject the value.
$ip = trim($cell->nodeValue);
if(filter_var($ip, FILTER_VALIDATE_IP)) {
echo 'valid ip: ' . $ip . "\n";
} else {
echo 'invalid ip: ' . $ip . "\n";
}
}
I don't know where the 22088 in the output you posted comes from; it does not appear in the string you provided.
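If the regex approach is kept anyway, two things probably explain the failure on repeated markup. First, (?:.|\n)* is greedy, so it runs to the last t_port and t_type cells in the string. Second, preg_match returns only the first match. A sketch that swaps in lazy quantifiers, the s modifier and preg_match_all is shown here. The function name extractProxiesWithRegex is hypothetical, and the pattern assumes the port is the number directly before the closing parenthesis of document.write(...).

```php
<?php
// Hypothetical helper; the pattern is an adaptation of the one in the question.
function extractProxiesWithRegex(string $html): array
{
    // "~" is used as the delimiter so the double quotes need no escaping.
    // The "s" modifier lets "." match newlines, replacing (?:.|\n).
    // ".*?" is lazy, so each match stops at the nearest following cell.
    $regex = '~<td class="t_ip">\s*((?:[0-9]{1,3}\.){3}[0-9]{1,3})'  // group 1: IP
           . '.*?<td class="t_port">.*?\^([0-9]{1,5})\s*\)'           // group 2: port (assumed position)
           . '.*?<td class="t_type">\s*([0-9])~s';                    // group 3: type

    preg_match_all($regex, $html, $matches, PREG_SET_ORDER);

    $result = [];
    foreach ($matches as $m) {
        $result[] = $m[1] . ':' . $m[2] . ' ' . $m[3];
    }
    return $result;
}
```

With the three-row $string from the question, print_r(extractProxiesWithRegex($string)) should list one "ip:port type" entry per row. The parser-based approach remains more robust if the markup changes.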