Question

Possible duplicate:
Regexp for extracting a mailto: address

I want to extract the email addresses from a page with the script below, but I'm not sure which pattern to use in preg_match_all.

 $original_file = file_get_contents("http://www.example.com/");
 $stripped_file = strip_tags($original_file, "<a>");
 preg_match_all("/<a(?:[^>]*)href=\"([^\"]*)\"(?:[^>]*)>(?:[^<]*)<\/a>/is", $stripped_file, $matches);

 header("Content-type: text/plain"); 
 print_r($matches); //View the array to see if it worked

Answer 1

You might have better luck with an HTML parser such as PHP Simple HTML DOM Parser, which lets you work with an HTML document in a more natural way, for example:

// Find all anchors, returns an array of element objects
$ret = $html->find('a');

Then loop over the returned array of elements and check whether each href contains an @ sign.
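The loop-and-check idea can be sketched roughly as follows. Note that this sketch swaps in PHP's built-in DOMDocument instead of Simple HTML DOM so that it runs without an external library; the helper name find_email_hrefs and the sample markup are hypothetical, and an href containing @ is only a heuristic (a URL with embedded credentials would also match).

```php
<?php
// Hypothetical helper: collect href values that look like email links.
// Assumes the DOM extension is enabled (it is in default PHP builds).
function find_email_hrefs(string $html): array
{
    // Collect parse warnings internally instead of printing them,
    // since real-world markup is often malformed.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $emails = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = $anchor->getAttribute('href');
        if (strpos($href, '@') === false) {
            continue; // no @ sign, so not treated as an email link
        }
        // Drop a leading "mailto:" and any "?subject=..." query part.
        $address = preg_replace('/^mailto:/i', '', $href);
        $address = explode('?', $address, 2)[0];
        $emails[] = trim($address);
    }
    // Remove duplicates and reindex the array from 0.
    return array_values(array_unique($emails));
}

// Made-up sample markup for illustration.
$sample = '<p><a href="mailto:alice@example.com?subject=Hi">Alice</a> '
        . '<a href="/about">About</a> '
        . '<a href="mailto:bob@example.com">Bob</a></p>';
print_r(find_email_hrefs($sample));
```

With Simple HTML DOM the same loop would run over the result of $html->find('a') and read each element's href instead.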

Answer 2

Edit: I just realized you mean mailto: links.

That is answered here:

Regexp for extracting a mailto: address
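For a rough idea of what a mailto-specific pattern for preg_match_all might look like, here is a minimal sketch. The pattern is an illustrative assumption and not the one given in the linked answer; the helper name extract_mailto_addresses and the sample markup are hypothetical. It only handles quoted href values and does no validation of the address itself.

```php
<?php
// Hypothetical helper: pull addresses out of <a href="mailto:..."> links.
function extract_mailto_addresses(string $html): array
{
    // Group 1: the quote character that opens the href value.
    // Group 2: the address, up to a "?", a quote, whitespace, or ">".
    // The trailing [^"']* skips an optional "?subject=..." part,
    // and \1 requires the matching closing quote.
    $pattern = '/<a\b[^>]*\bhref\s*=\s*(["\'])mailto:([^"\'?\s>]+)[^"\']*\1/i';
    preg_match_all($pattern, $html, $matches);

    // Remove duplicates and reindex the array from 0.
    return array_values(array_unique($matches[2]));
}

// Made-up sample markup for illustration.
$page = '<a href="mailto:alice@example.com?subject=Hi">Alice</a> '
      . "<a class='c' href='MAILTO:bob@example.com'>Bob</a> "
      . '<a href="http://www.example.com/">Home</a>';
print_r(extract_mailto_addresses($page));
```

In the script from the question, $stripped_file could be passed in place of $page, since strip_tags($original_file, "<a>") keeps the anchor tags intact.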
