Is my website being hit by crawlers?

Time: 2014-04-22 06:30:49

Tags: internet-explorer http browser web-crawler

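I created a website, let's say abc.com, and I share links to it, with some query strings attached, through Facebook, Twitter and email. When someone clicks such a link from Facebook, Twitter or an email, abc.com records a site-view activity.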

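When an ad is posted on Facebook or Twitter, abc.com also receives hits from crawlers and records them as site-view activity. So I wrote this code to check whether a request comes from a crawler. The problem now is that this code does not work for Internet Explorer (Request.UrlReferrer is null).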

    /// <summary>
    /// Checks whether the request comes from a crawler rather than from a manual
    /// click in an email or on a social network.
    /// </summary>
    /// <returns>True if the request looks like a crawler; otherwise false.</returns>
    private bool isCrawlerRequest()
    {
        try
        {
            // A manual click from Facebook or Twitter sends a UrlReferrer, while crawlers do not,
            // so a missing UrlReferrer suggests a crawler. Manual clicks from mailboxes also send
            // no UrlReferrer, but they do carry the query string parameter "Source=email".
            // So if UrlReferrer and Utm_Source are both missing, we assume it is a crawler.
            // Also, a genuine UserAgent should not contain ".com", "www.", "http:" or "https:";
            // only crawler user agents contain such text.
            bool urlReferrer = Request.UrlReferrer != null
                && !string.IsNullOrEmpty(Request.UrlReferrer.OriginalString);
            bool fromEmail = string.Equals(Source, "email", StringComparison.CurrentCultureIgnoreCase);

            if ((urlReferrer || fromEmail)
                && !UIHelper.Contains(Request.UserAgent, ".com", StringComparison.OrdinalIgnoreCase)
                && !UIHelper.Contains(Request.UserAgent, "www.", StringComparison.OrdinalIgnoreCase)
                && !UIHelper.Contains(Request.UserAgent, "https:", StringComparison.OrdinalIgnoreCase)
                && !UIHelper.Contains(Request.UserAgent, "http:", StringComparison.OrdinalIgnoreCase))
            {
                return false;
            }
        }
        catch (Exception)
        {
            // Any exception is ignored; the request is then treated as a crawler.
        }
        return true;
    }
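
To make the Internet Explorer case easier to reproduce, the same heuristic could be pulled out into a pure function that takes the referrer, the source parameter and the User-Agent as plain arguments. This is only a sketch: the class name `CrawlerHeuristic`, the `CrawlerMarkers` array and the null-handling of the User-Agent are assumptions made for illustration, and it assumes the intended rule is "(referrer present OR source is email) AND User-Agent contains none of the marker strings". It also replaces the `UIHelper.Contains` helper, whose implementation is not shown here, with `string.IndexOf`.

```csharp
using System;

// Hypothetical helper class, not part of the original code.
public static class CrawlerHeuristic
{
    // Substrings that, under the heuristic described above, are assumed
    // to appear only in crawler User-Agent strings.
    private static readonly string[] CrawlerMarkers = { ".com", "www.", "https:", "http:" };

    // Returns true when the request looks like a crawler under that heuristic.
    public static bool IsCrawlerRequest(Uri referrer, string source, string userAgent)
    {
        bool hasReferrer = referrer != null && !string.IsNullOrEmpty(referrer.OriginalString);
        bool fromEmail = string.Equals(source, "email", StringComparison.OrdinalIgnoreCase);

        // Assumption of this sketch: a missing User-Agent counts as "no markers found".
        string ua = userAgent ?? string.Empty;
        bool userAgentLooksLikeCrawler = false;
        foreach (string marker in CrawlerMarkers)
        {
            if (ua.IndexOf(marker, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                userAgentLooksLikeCrawler = true;
                break;
            }
        }

        // Not a crawler only if there is some sign of a manual click
        // and the User-Agent contains no marker string.
        return !((hasReferrer || fromEmail) && !userAgentLooksLikeCrawler);
    }
}
```

Called with no referrer and no `Source=email`, as in `CrawlerHeuristic.IsCrawlerRequest(null, null, "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1)")`, the function returns true. That is the situation described above: when Internet Explorer sends no referrer, a manual click cannot be told apart from a crawler by this rule alone.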

0 answers:

No answers