Suspend a thread until the WebBrowser has finished loading

Asked: 2012-07-03 18:43:59

Tags: c#

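I'm trying to navigate a website and do some work on its pages programmatically, using the WebBrowser control in Windows Forms. While looking for a way to block my thread until the WebBrowser's DocumentCompleted event fires, I found this. Based on that, here is my current code: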

public partial class Form1 : Form
{
    private AutoResetEvent autoResetEvent = new AutoResetEvent(false); // starts unsignaled

    public Form1()
    {
        InitializeComponent();
    }

    private void button1_Click(object sender, EventArgs e)
    {
        Thread workerThread = new Thread(new ThreadStart(this.DoWork));
        workerThread.SetApartmentState(ApartmentState.STA);
        workerThread.Start();
    }

    private void DoWork()
    {
        WebBrowser browser = new WebBrowser();
        browser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(browser_DocumentCompleted);
        browser.Navigate(login_page);
        autoResetEvent.WaitOne();
        // log in

        browser.Navigate(page_to_process);
        autoResetEvent.WaitOne();
        // process the page
    }

    private void browser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        autoResetEvent.Set();
    }
}

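The thread may look unnecessary, but it will make sense once I extend this code to accept requests over the network (the thread will listen for connections and then process the requests). Also, I can't simply put the processing code in the DocumentCompleted handler, because I have to navigate to several different pages and do different things on each one.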

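Now, as I understand it, the reason this doesn't work is that the DocumentCompleted event is raised on the same thread that called WaitOne(), so the event cannot fire until WaitOne() returns (which, in this case, is never).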
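The reasoning above can be reproduced without a browser. The following is a minimal sketch, not WinForms code: the Queue<Action> is a hypothetical stand-in for the thread's message queue, and the class and method names are invented for illustration. It shows that a wait on the thread that owns the queue times out while the "event" is still queued, and succeeds once the queue has been pumped first.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical illustration; none of these names come from the original code.
public static class PumpDemo
{
    // Waits while the "DocumentCompleted" message is still sitting in the queue.
    public static bool WaitWithoutPumping()
    {
        var messageQueue = new Queue<Action>();   // stand-in for the thread's message queue
        using (var signal = new AutoResetEvent(false))
        {
            messageQueue.Enqueue(() => signal.Set());   // the pending "event"
            // Nothing dequeues the message while this thread is blocked,
            // so the wait can only end by timing out.
            return signal.WaitOne(TimeSpan.FromMilliseconds(200));
        }
    }

    // Pumps the queue first, so the handler runs and the wait succeeds.
    public static bool WaitAfterPumping()
    {
        var messageQueue = new Queue<Action>();
        using (var signal = new AutoResetEvent(false))
        {
            messageQueue.Enqueue(() => signal.Set());
            while (messageQueue.Count > 0)
                messageQueue.Dequeue()();         // run each queued handler
            return signal.WaitOne(TimeSpan.FromMilliseconds(200));
        }
    }
}
```

With these definitions, PumpDemo.WaitWithoutPumping() returns false after the timeout and PumpDemo.WaitAfterPumping() returns true.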

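Interestingly, if I add a WebBrowser control from the toolbox (drag and drop) and then use it to navigate, this code works (with no changes other than wrapping the call to Navigate in a call to Invoke; see below). However, if I add the WebBrowser control to the Designer file by hand, it doesn't work. I really don't want a visible WebBrowser on my form; I only want to report the results.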

public delegate void NavigateDelegate(string address);
browser.Invoke(new NavigateDelegate(this.browser.Navigate), new string[] { login_page });

My question is: what is the best way to suspend a thread until the browser's DocumentCompleted event fires?

3 Answers:

Answer 0: (score: 1)

Chris,

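Here is one possible implementation that solves the problem, but please read the comments below about the issues I had to face and fix before everything worked as I expected. This is an example of a method that performs some actions on a page in a webBrowser (note that the webBrowser is part of a Form in my case):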

    internal ActionResponse CheckMessages() //ActionResponse is a custom class of mine to store some data coming from pages
    {
        //go to messages
        HtmlDocument doc = WbLink.Document; //WbLink is a reference to a webBrowser instance
        HtmlElement ele = doc.GetElementById("message_alert_box");
        if (ele == null)
            return new ActionResponse(false);

        object obj = ele.DomElement;
        System.Reflection.MethodInfo mi = obj.GetType().GetMethod("click");
        mi.Invoke(obj, new object[0]);

        semaphoreForDocCompletedEvent = WaitForDocumentCompleted();  //This is a simil-waitOne statement (1)
        if (!semaphoreForDocCompletedEvent)
            throw new Exception("sequencing of Document Completed events is failed.");

        //get the list
        doc = WbLink.Document;
        ele = doc.GetElementById("mailz");
        //wait until the "mailz" element is available (AJAX may delay it), and give up if it never shows up
        if (!ele.WaitForAvailability("mailz", Program.BrowsingSystem.Document, 10000)) //This is a simil-waitOne statement (2)
            return new ActionResponse(false);
        ele = doc.GetElementById("mailz");

        //this contains a tbody
        HtmlElement tbody = ele.FirstChild;

        //count how many elements are espionage reports; these elements are inline, so they count double together with the wrappers on top of them.
        int spioCases = 0;
        foreach (HtmlElement trs in tbody.Children)
        {
            if (trs.GetAttribute("id").ToLower().Contains("spio"))
                spioCases++;
        }

        int nMessages = tbody.Children.Count - 2 - spioCases;

        //create an array of messages to store data
        GameMessage[] archive = new GameMessage[nMessages];

        for (int counterOfOpenMessages = 0; counterOfOpenMessages < nMessages; counterOfOpenMessages++)
        {

            //open first element
            WbLink.ScriptErrorsSuppressed = true;
            ele = doc.GetElementById("mailz");
            //this contains a tbody
            tbody = ele.FirstChild;

            HtmlElement mess1 = tbody.Children[1];
            int idMess1 = int.Parse(mess1.GetAttribute("id").Substring(0, mess1.GetAttribute("id").Length - 2));
            //check if subsequent element is not a spio report, in case it is then the element has not to be opened.
            HtmlElement mess1Sibling = mess1.NextSibling;
            if (mess1Sibling.GetAttribute("id").ToLower().Contains("spio"))
            {
                //this is a wrapper for spio report
                ReadSpioEntry(archive, counterOfOpenMessages, mess1, mess1Sibling);
                //delete first in line
                DeleteFirstMessageItem(doc, ref ele, ref obj, ref mi, ref tbody);
                semaphoreForDocCompletedEvent = WaitForDocumentCompleted(6); //This is a simil-waitOne statement (3)

            }
            else
            {
                //It's a normal message
                OpenMessageEntry(ref obj, ref mi, tbody, idMess1); //This opens a modal dialog over the page, and it is not generating a DocumentCompleted Event in the webBrowser

                //actually opening a message generates 2 DocumentCompleted events without any Navigating event issued
                //Application.DoEvents();
                semaphoreForDocCompletedEvent = WaitForDocumentCompleted(6);

                //read element
                ReadMessageEntry(archive, counterOfOpenMessages);

                //close current message
                CloseMessageEntry(ref ele, ref obj, ref mi);  //this closes a modal dialog therefore is not generating a documentCompleted after!
                semaphoreForDocCompletedEvent = WaitForDocumentCompleted(6);
                //delete first in line
                DeleteFirstMessageItem(doc, ref ele, ref obj, ref mi, ref tbody); //this closes a modal dialog therefore is not generating a documentCompleted after!
                semaphoreForDocCompletedEvent = WaitForDocumentCompleted(6);
            }
        }
        return new ActionResponse(true, archive);
    }

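In practice, this method takes a page of an MMORPG, reads the messages that other players have sent to the account, and stores them in the ActionResponse class via the ReadMessageEntry method.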

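Apart from the implementation and logic of the code, which really depend on the case (and are of no use to you), there are a few interesting elements that may suit your situation. I put some comments in the code and highlighted 3 key points [marked with the symbols (1), (2) and (3)].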

The algorithm is:

1) Reach the page

2) Get the underlying Document from the webBrowser

3) Find the element to click in order to get to the messages page [done with: HtmlElement ele = doc.GetElementById("message_alert_box");]

4) Trigger its click event through a MethodInfo instance and a reflection-style call [this loads another page, so a DocumentCompleted will arrive sooner or later]

5) Wait for DocumentCompleted to be raised and then proceed [done with: semaphoreForDocCompletedEvent = WaitForDocumentCompleted(); at point (1)]

6) Get the new Document from the webBrowser after the page has changed

7) Find a specific anchor on the page that marks where the messages I want to read are

8) Make sure that this tag is present in the page (because some AJAX may delay the content I want to read) [done with: ele.WaitForAvailability("mailz", Program.BrowsingSystem.Document, 10000), i.e. point (2)]

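9) Run the whole loop that reads each message. This means opening a modal dialog that lives on the same page and therefore does not generate a DocumentCompleted, reading it once it is ready, closing it, and looping again. For this particular case I use an overload of (1), called as semaphoreForDocCompletedEvent = WaitForDocumentCompleted(6); at point (3)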

Now, these are the three methods I use to pause, check and read:

(1) To stop until DocumentCompleted is raised, without overloading the DocumentCompleted handler, which may be used for more than one single purpose (as in your case):

private bool WaitForDocumentCompleted()
        {
            Thread.SpinWait(1000);  //This is dirty but working
            while (Program.BrowsingSystem.IsBusy) //BrowsingSystem is another link to the browser that is made public in my Form; IsBusy is just a bool set to true when the Navigating event is raised and set to false when DocumentCompleted is fired.
            {
                Application.DoEvents();
                Thread.SpinWait(1000);
            }

            if (Program.BrowsingSystem.IsInfoAvailable)  //IsInfoAvailable is just a get property that wraps webBrowser.Document inside a lock statement to protect it from concurrent accesses.
            {
                return true;
            }
            else
                return false;
        }

(2) To wait for a specific tag to become available in the page:

public static bool WaitForAvailability(this HtmlElement tag, string id, HtmlDocument documentToExtractFrom, long maxCycles)
        {
            bool cond = true;
            long counter = 0;
            while (cond)
            {
                Application.DoEvents(); //VERIFY trovare un modo per rimuovere questa porcheria
                tag = documentToExtractFrom.GetElementById(id);
                if (tag != null)
                    cond = false;
                Thread.Yield();
                Thread.SpinWait(100000);
                counter++;
                if (counter > maxCycles)
                    return false;
            }
            return true;
        }

(3) The dirty trick to wait for a DocumentCompleted that may never arrive, because no frame needs to be reloaded on the page!

private bool WaitForDocumentCompleted(int seconds)
    {
        int counter = 0;
        while (Program.BrowsingSystem.IsBusy)
        {
            Application.DoEvents();
            Thread.Sleep(1000);
            if (counter == seconds)
            {
                //timed out: give up waiting and carry on anyway
                return true;
            }
            counter++;
        }
        return true;
    }
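The three waiting methods above share one shape: poll a condition, keep the message loop alive, and stop after a limit. A minimal sketch of that shape follows, assuming a hypothetical helper named Poll.Until that is not part of the original answer. It measures the limit with a Stopwatch rather than counting iterations and, unlike method (3), returns false on timeout so the caller can tell the two outcomes apart.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper; the name and signature are assumptions for illustration.
public static class Poll
{
    // Returns true as soon as condition() is true, or false once the timeout has elapsed.
    // 'pump' runs on every iteration; in a WinForms app it could be Application.DoEvents.
    public static bool Until(Func<bool> condition, TimeSpan timeout, Action pump = null)
    {
        var clock = Stopwatch.StartNew();
        while (!condition())
        {
            if (clock.Elapsed >= timeout)
                return false;          // the condition never became true in time
            if (pump != null)
                pump();                // let pending events be processed
            Thread.Sleep(10);          // avoid spinning at full speed
        }
        return true;
    }
}
```

In the answer's setting, a call such as Poll.Until(() => !Program.BrowsingSystem.IsBusy, TimeSpan.FromSeconds(6), Application.DoEvents) would play the role of WaitForDocumentCompleted(6). It still relies on DoEvents, so the same reservations apply.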

I am also pasting the DocumentCompleted and Navigating handlers to give you the full picture of how I use them.

private void webBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            if (Program.BrowsingSystem.BrowserLink.ReadyState == WebBrowserReadyState.Complete)
            {
                lock (Program.BrowsingSystem.BrowserLocker)
                {
                    Program.BrowsingSystem.ActualPosition = Program.BrowsingSystem.UpdatePosition(Program.BrowsingSystem.Document);
                    Program.BrowsingSystem.CheckContentAvailability();
                    Program.BrowsingSystem.IsBusy = false;
                }
            }
        }

private void webBrowser_Navigating(object sender, WebBrowserNavigatingEventArgs e)
        {
            lock (Program.BrowsingSystem.BrowserLocker)
            {
                Program.BrowsingSystem.ActualPosition.PageName = OgamePages.OnChange;
                Program.BrowsingSystem.IsBusy = true;
            }
        }

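Now that you know the details behind the implementation presented here, please take a look here to learn about the mess behind DoEvents() (I hope it is not a problem to link to other sites from Stack Overflow).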

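One last small note about the fact that you need to call the Navigate method through Invoke when you use it from within a Form instance: it is clear that you need an Invoke, because any method that works on the webBrowser (or even just brings it into scope as a reference variable) has to run on the same thread as the webBrowser itself!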

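Moreover, if the WB is a child of some kind of Form container, the thread that instantiates it also has to be the one that created the Form, and by transitivity all the methods that work on the WB have to be called on the Form thread (in your case the Invoke relocates your call onto the Form's native thread). I hope this is useful to you (I left the // VERIFY comment in the code in my native language just to let you know what I think about Application.DoEvents()).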
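The thread rule described above can be sketched as a small helper. This is only an illustration: the class and method names are hypothetical, while Control.InvokeRequired and Control.Invoke are the standard WinForms members for marshaling a call to the thread that owns a control.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical helper; the names are assumptions for illustration.
public static class BrowserThreading
{
    // Calls Navigate on the thread that owns the browser's window handle.
    public static void NavigateOnOwnerThread(WebBrowser browser, string url)
    {
        if (browser.InvokeRequired)
        {
            // Called from another thread: marshal the call to the owning thread
            // and block until it has run there.
            browser.Invoke(new Action(() => browser.Navigate(url)));
        }
        else
        {
            // Already on the owning thread: call directly.
            browser.Navigate(url);
        }
    }
}
```

This only helps when the owning thread is running a message loop, which is the case for a control hosted on a visible Form.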

Kind regards, Alex

Answer 1: (score: 0)

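HAH! I had the same problem. You can do this with event handling. If you stop a thread midway through a page, it will need to wait until the page completes. You can do this easily by attaching a handler: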
 Page.LoadComplete += new EventHandler(triggerFunction);

In triggerFunction you can do this:

void triggerFunction(object sender, EventArgs e)
{
     autoResetEvent.Set(); // Set() (not Reset()) releases the thread blocked in WaitOne()
}

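Let me know if it works. In the end I didn't use threads in mine and just put the work into triggerFunction. Some of the syntax may not be 100% correct, because I'm answering off the top of my head.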

Answer 2: (score: 0)

Edit

Register the handler in the InitializeComponent method like this, rather than registering it in the same method that navigates.

WebBrowser browser = new WebBrowser();
browser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(webBrowser_DocumentCompleted);

When checked inside the DocumentCompleted event, ReadyState tells you how far the document load has progressed.

void webBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    if (browser.ReadyState == WebBrowserReadyState.Complete)
    {
        // the document has finished loading; do the work here
    }
}
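The ReadyState check above can be combined with a TaskCompletionSource so that the worker code reads top to bottom without blocking its own thread. The following is a sketch under several assumptions, not a tested solution for this question: it needs C# 5 / .NET 4.5 for async/await, the names BrowserTasks, NavigateAsync, StartWorker and RunAsync are hypothetical, and it assumes that an off-screen WebBrowser raises DocumentCompleted once its STA thread runs a message loop via Application.Run.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Windows.Forms;

// Hypothetical sketch; the class and method names are assumptions for illustration.
public static class BrowserTasks
{
    // Starts a navigation and returns a task that completes once the document is fully loaded.
    public static Task NavigateAsync(WebBrowser browser, string url)
    {
        var loaded = new TaskCompletionSource<bool>();
        WebBrowserDocumentCompletedEventHandler handler = null;
        handler = (sender, e) =>
        {
            // DocumentCompleted can fire for individual frames; wait for the whole document.
            if (browser.ReadyState != WebBrowserReadyState.Complete)
                return;
            browser.DocumentCompleted -= handler;   // one-shot handler
            loaded.TrySetResult(true);
        };
        browser.DocumentCompleted += handler;
        browser.Navigate(url);
        return loaded.Task;
    }

    // Runs the two-step job from the question on a dedicated STA thread with its own message loop.
    public static Thread StartWorker(string loginPage, string pageToProcess)
    {
        var worker = new Thread(() =>
        {
            // Make awaits resume on this thread through its message loop.
            SynchronizationContext.SetSynchronizationContext(new WindowsFormsSynchronizationContext());
            Task job = RunAsync(loginPage, pageToProcess);
            // End the message loop when the job finishes, whether it succeeded or failed.
            job.ContinueWith(_ => Application.ExitThread(),
                             TaskScheduler.FromCurrentSynchronizationContext());
            Application.Run();   // pumps messages so DocumentCompleted can be raised
        });
        worker.SetApartmentState(ApartmentState.STA);
        worker.Start();
        return worker;
    }

    private static async Task RunAsync(string loginPage, string pageToProcess)
    {
        using (var browser = new WebBrowser())
        {
            await NavigateAsync(browser, loginPage);
            // log in

            await NavigateAsync(browser, pageToProcess);
            // process the page
        }
    }
}
```

The design choice here is that the thread never blocks: each await returns control to the message loop, which is what allows DocumentCompleted to be delivered on that same thread.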