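How do I scrape an email address from a website?
I am trying to extract an email address from a website that appears to protect it with some JavaScript.
Here is the HTML: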
<p class="email">
<a href="mailto:info@aryanaz.ir" class="email">
info@aryanaz.ir
<script type="text/javascript">
/* <![CDATA[ */
(function(){try{var s,a,i,j,r,c,l,b=document.getElementsByTagName("script");l=b[b.length-1].previousSibling;a=l.getAttribute('data-cfemail');if(a){s='';r=parseInt(a.substr(0,2),16);for(j=2;a.length-j;j+=2){c=parseInt(a.substr(j,2),16)^r;s+=String.fromCharCode(c);}s=document.createTextNode(s);l.parentNode.replaceChild(s,l);}}catch(e){}})();
/* ]]> */
</script></a>
</p>
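For readers trying to follow the inline script, here is a minimal sketch of its decoding loop as a standalone function. It assumes the raw page carries the hex-encoded address in a `data-cfemail` attribute, as the script's `getAttribute('data-cfemail')` call suggests: the first two hex digits are an XOR key and each following pair is one character. The names `decodeCfEmail` and `encodeCfEmail` are made up for illustration, and `encodeCfEmail` exists only to produce sample input.

```javascript
// Readable restatement of the inline script's decoding loop (a sketch, not the site's code).
// `encoded` is the hex string assumed to come from the data-cfemail attribute.
function decodeCfEmail(encoded) {
  // The first byte (two hex digits) is the XOR key.
  const key = parseInt(encoded.slice(0, 2), 16);
  let decoded = "";
  // Every following pair of hex digits is one character XORed with the key.
  for (let j = 2; j < encoded.length; j += 2) {
    const code = parseInt(encoded.slice(j, j + 2), 16) ^ key;
    decoded += String.fromCharCode(code);
  }
  return decoded;
}

// Hypothetical helper, used only to build sample input for this demo.
function encodeCfEmail(plain, key) {
  const hex = (n) => n.toString(16).padStart(2, "0");
  let out = hex(key);
  for (const ch of plain) {
    out += hex(ch.charCodeAt(0) ^ key);
  }
  return out;
}

const sample = encodeCfEmail("info@aryanaz.ir", 0x5a);
console.log(decodeCfEmail(sample)); // prints "info@aryanaz.ir"
```

Because an HTML parser such as HtmlAgilityPack does not execute scripts, the same steps would have to be applied to the attribute value by the scraper itself; the loop is short enough to port to C# line by line.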
I use the following code to capture the protected value, but it does not work:
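// Requires: using System.Data; using System.Data.SqlClient; using HtmlAgilityPack;
// hw is assumed to be an HtmlWeb instance created earlier.
HtmlAgilityPack.HtmlDocument doc = hw.Load(url);
// SelectNodes returns null when nothing matches, so guard before iterating.
var emailNodes = doc.DocumentNode.SelectNodes("//a[contains(@href, 'mailto:')]");
if (emailNodes != null)
{
    foreach (HtmlNode node in emailNodes)
    {
        string email = node.InnerHtml.Trim();
        if (email != "")
        {
            // Store the value through the addfullvalueemail stored procedure.
            ClassBase.ENonQuery("addfullvalueemail", System.Data.CommandType.StoredProcedure, new SqlParameter[]
            {
                new SqlParameter("@Email", email),
            });
        }
    }
}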