How to get an email address with Html Agility Pack 2

Date: 2013-11-09 14:51:58

Tags: c# javascript html-agility-pack selectnodes

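How can I scrape email addresses from a website?

I am trying to read an email address from a website where it appears to be protected by some JavaScript...

Here is the HTML: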

<p class="email">
<a href="mailto:info@aryanaz.ir" class="email">
    info@aryanaz.ir
<script type="text/javascript">
/* <![CDATA[ */
(function(){try{var s,a,i,j,r,c,l,b=document.getElementsByTagName("script");l=b[b.length-1].previousSibling;a=l.getAttribute('data-cfemail');if(a){s='';r=parseInt(a.substr(0,2),16);for(j=2;a.length-j;j+=2){c=parseInt(a.substr(j,2),16)^r;s+=String.fromCharCode(c);}s=document.createTextNode(s);l.parentNode.replaceChild(s,l);}}catch(e){}})();
/* ]]> */
</script></a>
</p>

I use this code to capture the protected value, but it does not work:

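    // hw is assumed to be an HtmlWeb instance; it is declared here so the snippet is complete.
    // HtmlWeb downloads and parses the page but does not execute its JavaScript.
    var hw = new HtmlAgilityPack.HtmlWeb();
    HtmlAgilityPack.HtmlDocument doc = hw.Load(url);

    // SelectNodes returns null (not an empty collection) when nothing matches.
    var emailNodes = doc.DocumentNode.SelectNodes("//a[contains(@href, 'mailto:')]");
    if (emailNodes != null)
    {
        foreach (HtmlNode node in emailNodes)
        {
            string email = node.InnerHtml.Trim();

            if (email != "")
            {
                // Trailing spaces removed from the procedure and parameter names.
                ClassBase.ENonQuery("addfullvalueemail", System.Data.CommandType.StoredProcedure, new SqlParameter[]
                {
                    new SqlParameter("@Email", email),
                });
            }
        }
    }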
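
Html Agility Pack only parses the markup it downloads; it does not run the inline script, so the decoded address that a browser shows may never appear in the parsed document. The script in the HTML above reads a `data-cfemail` attribute from the element just before it, takes the first hex byte as a key, and XORs every following hex byte with that key. A minimal C# sketch of the same loop follows, assuming the raw page still carries that attribute. The class name `CfEmail` and method name `Decode` are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Text;

public static class CfEmail
{
    // Port of the inline script's loop: the first hex byte is the XOR key,
    // and each following hex byte is one character XORed with that key.
    // Returns an empty string for missing or odd-length input;
    // Convert.ToInt32 throws FormatException if the input is not valid hex.
    public static string Decode(string hex)
    {
        if (string.IsNullOrEmpty(hex) || hex.Length < 2 || hex.Length % 2 != 0)
        {
            return string.Empty;
        }

        int key = Convert.ToInt32(hex.Substring(0, 2), 16);
        var decoded = new StringBuilder();

        for (int j = 2; j < hex.Length; j += 2)
        {
            int c = Convert.ToInt32(hex.Substring(j, 2), 16) ^ key;
            decoded.Append((char)c);
        }

        return decoded.ToString();
    }
}
```

As a made-up sample (not taken from the site), `CfEmail.Decode("016041632f62")` yields `a@b.c`. With Html Agility Pack, the encoded value could presumably be read by selecting `//*[@data-cfemail]` and calling `node.GetAttributeValue("data-cfemail", "")` on each match, with the usual null check on the `SelectNodes` result.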

0 answers:

No answers