Changing a synchronous web request to an asynchronous one

Asked: 2012-11-04 20:18:27

Tags: asp.net asynchronous

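I have been digging around the web for a while and have not found a code example that helps me solve my problem. I have looked at sample code, but I still don't "get" it.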

I have read

http://msdn.microsoft.com/en-us/library/aa480507.aspx and  
http://msdn.microsoft.com/en-us/library/dd781401.aspx

but I can't seem to get it working.

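I am using HtmlAgilityPack.

Currently I make up to 20 web requests.

When a request completes, the result is added to a dictionary, and a method then searches it for the information. If the information is found the code exits; if not, it makes another web request, until the limit is reached. I need to be able to stop all of the threads' async calls as soon as everything has been found.

Something like this: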

    public void FetchAndParseAllPages()
    {
        PageFetcher fetcher = new PageFetcher();
        for (int i = 0; i < _maxSearchDepth; i += _searchIncrement)
        {
            string keywordNsearch = _keyword + i;
            ParseHtmldocuments(fetcher.GetWebpage(keywordNsearch));

            // Check whether the information was found; if it was,
            // save it to the database and stop.
            if (GetPostion() != 201)
            {
                // Add data to the database
                InsertRankingData(DocParser.GetSearchResults(), _theSearchedKeyword);
                return;
            }
        }
    }

This is the class that fetches the page:

    public HtmlDocument GetWebpage(string urlToParse)
    {
        System.Net.ServicePointManager.Expect100Continue = false;
        HtmlWeb htmlweb = new HtmlWeb();
        htmlweb.PreRequest = new HtmlAgilityPack.HtmlWeb.PreRequestHandler(OnPreRequest);
        // Pass the urlToParse variable, not the string literal "urlToParse".
        HtmlDocument htmldoc = htmlweb.Load(urlToParse, "38.69.197.71", 45623, "PROXYUSER", "PROXYPASSWORD");

        return htmldoc;
    }

    public bool OnPreRequest(HttpWebRequest request)
    {
       // request.UserAgent = RandomUseragent();
        request.KeepAlive = false;
        request.Timeout = 100000;
        request.ReadWriteTimeout = 1000000; 
        request.ProtocolVersion = HttpVersion.Version10;
        return true; // ok, go on
    }

How can I make this asynchronous and really fast? And should I be using threads at all when doing it asynchronously?

1 Answer:

Answer 0: (score: 0)

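OK, I solved it! At least I think so. Execution time dropped to about seven seconds; without async it took about 30 seconds.

Here is my code for future reference. Edit: I used a console project to test the code, and I am also using HtmlAgilityPack. This is how I did it; any tips on how to optimize it further would be welcome.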

    public delegate HtmlDocument FetchPageDelegate(string url);

    static void Main(string[] args)
    {
        System.Net.ServicePointManager.DefaultConnectionLimit = 10;
        FetchPageDelegate del = new FetchPageDelegate(FetchPage);
        List<HtmlDocument> htmllist = new List<HtmlDocument>();
        List<IAsyncResult> results = new List<IAsyncResult>();
        List<WaitHandle> waitHandles = new List<WaitHandle>();

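        DateTime start = DateTime.Now;
        for (int i = 0; i < 200; i += 10)
        {
            // Placeholder: build or read the real URL for each request here.
            string url = @"URLSTOPARSE YOU CHANGE IT HERE READ FROM LIST OR ANYTHING";
            // Run FetchPage on a thread-pool thread.
            IAsyncResult result = del.BeginInvoke(url, null, null);
            results.Add(result);
            waitHandles.Add(result.AsyncWaitHandle);
        }

        // Block until every fetch has finished (WaitAll accepts at most 64 handles).
        WaitHandle.WaitAll(waitHandles.ToArray());

        foreach (IAsyncResult ar in results)
        {
            // AsyncResult lives in System.Runtime.Remoting.Messaging.
            FetchPageDelegate delle = (ar as AsyncResult).AsyncDelegate as FetchPageDelegate;
            htmllist.Add(delle.EndInvoke(ar));
        }
        Console.WriteLine("Elapsed: " + (DateTime.Now - start));
        Console.ReadLine();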

    }

    static HtmlDocument FetchPage(string url)
    {
        HtmlWeb htmlweb = new HtmlWeb();
        HtmlDocument htmldoc = htmlweb.Load(url);
        return htmldoc; 
    }
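
The code above waits for all 20 fetches before looking at any result, so it does not cover the "stop everything once the information is found" part of the question. Also note that `Delegate.BeginInvoke` is only supported on .NET Framework. Below is a minimal sketch of one way to exit early, assuming .NET 4.5 or later and using `Task.WhenAny` with a `CancellationToken`. `FetchPageAsync`, the `"TARGET"` marker, and the simulated delays are hypothetical stand-ins (not HtmlAgilityPack calls) so that the example runs without a network; a real version would download and parse the page there.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class EarlyExitSketch
{
    // Hypothetical stand-in for a real page fetch: waits a little, then
    // returns fake page text. Only the page at offset 30 "contains" the target.
    static async Task<string> FetchPageAsync(int offset, CancellationToken token)
    {
        await Task.Delay(50 + offset, token); // throws if the token is cancelled
        return offset == 30 ? "page " + offset + " contains TARGET" : "page " + offset;
    }

    // Starts all fetches at once, inspects each result as it completes,
    // and cancels the remaining fetches as soon as a match is found.
    public static async Task<string> FindFirstMatchAsync(int maxDepth, int increment)
    {
        using (var cts = new CancellationTokenSource())
        {
            var pending = new List<Task<string>>();
            for (int i = 0; i < maxDepth; i += increment)
                pending.Add(FetchPageAsync(i, cts.Token));

            while (pending.Count > 0)
            {
                Task<string> finished = await Task.WhenAny(pending);
                pending.Remove(finished);

                // Skip fetches that failed or were cancelled.
                if (finished.Status != TaskStatus.RanToCompletion)
                    continue;

                if (finished.Result.Contains("TARGET"))
                {
                    cts.Cancel(); // ask the still-running fetches to stop
                    return finished.Result;
                }
            }
            return null; // nothing matched within the search depth
        }
    }

    public static void Main()
    {
        string hit = FindFirstMatchAsync(200, 10).GetAwaiter().GetResult();
        Console.WriteLine(hit ?? "not found");
    }
}
```

Cancellation here is cooperative: a fetch only stops if the code inside it observes the token, so a blocking call such as `HtmlWeb.Load` would still run to completion on its own thread even though its result is ignored.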