How to update one WebBrowser's DocumentText from another WebBrowser's DocumentText

Asked: 2014-04-02 18:47:49

Tags: c# .net multithreading winforms webbrowser-control

I have a Windows Form that contains a WebBrowser control named formWebBrowser. I also create a new non-UI thread that hosts a second WebBrowser instance named newThreadBrowser, following WebBrowser Control in a new thread.

When the DocumentCompleted event fires, I can write the URL into a text box using the approach described in C# - Updating GUI using non-main Thread.

Now I am trying to set the HTML of formWebBrowser from the HTML of newThreadBrowser. This throws an exception with the message "Specified cast is not valid."

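In WebBrowser control: "Specified cast is not valid.", the accepted answer says:

"WebBrowser is a COM component under the hood. An apartment-threaded one, so COM takes care of calling its methods in a thread-safe way. Your Navigate() call works for that reason; it is actually executed on the UI thread. What doesn't work is the DocumentText property: it is implemented in the .NET wrapper, and they fumbled the code somewhat. It bombs when the COM interop support in the CLR notices that a thread in the MTA tries to access a property of a component that lives on an STA."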

Question

What should I do to render the HTML from newThreadBrowser in formWebBrowser? I am not sure how Control.Invoke() would solve this.

Note: this application is not performance-critical, so it is fine if the update takes some time to execute.

References

  1. How to change webBrowser DocumentText?
  2. How do I extract info from a webpage?
  3. http://htmlagilitypack.codeplex.com/
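  4. From the WebBrowser.DocumentText Property documentation:

    Use this property when you want to manipulate the contents of an HTML page displayed in the WebBrowser control using string-processing tools. For example, you can use this property to load pages from a database or to analyze pages using regular expressions. When you set this property, the WebBrowser control automatically navigates to the about:blank URL before loading the specified text. This means that the Navigating, Navigated, and DocumentCompleted events occur when you set this property, and the value of the Url property is no longer meaningful.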

    CODE

    public partial class Form1 : Form
    {
    
        public void WriteToTextBoxEvent(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
    
            #region Textbox
            if (this.textBox1.InvokeRequired)
            {
                //BeginInvoke is asynchronous
                this.textBox1.BeginInvoke(new Action(() => WriteToTextBoxEvent(sender, e)));
            }
            else
            {
                textBox1.Text = e.Url.ToString();
            }
            #endregion
    
            #region WebBrowser
            if (this.formWebBrowser.InvokeRequired)
            {
                //BeginInvoke is asynchronous
                this.formWebBrowser.BeginInvoke(new Action(() => WriteToTextBoxEvent(sender, e)));
            }
            else
            {
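                var newThreadBrowser = sender as WebBrowser;
                if (newThreadBrowser != null)
                {
                    //The function evaluation requires all threads to run
                    //Note: this branch runs on the UI thread, but newThreadBrowser belongs to
                    //another STA thread, so this cross-thread read of DocumentText is the
                    //likely source of "Specified cast is not valid."
                    formWebBrowser.DocumentText = newThreadBrowser.DocumentText;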
                }
            }
            #endregion
        }
    
    
    
        System.Windows.Forms.TextBox textBox1 = new TextBox();
        System.Windows.Forms.WebBrowser formWebBrowser = new WebBrowser();
    
        public Form1()
        {
    
            WriteLogFunction("App Start");
    
            // Web Browser
            #region Web Browser
            formWebBrowser.Location = new Point(10, 20);
            formWebBrowser.Size = new Size(1200, 900);
            this.Controls.Add(formWebBrowser);
    
            textBox1.Location = new Point(0, 0);
            textBox1.Size = new Size(800, 10);
            this.Controls.Add(textBox1);
    
            var th = new Thread(() =>
            {
                var newThreadBrowser = new WebBrowser();
    
                //To Process the DOM.
                newThreadBrowser.DocumentCompleted += browser_DocumentCompleted;
    
                //To update URL textbox
                newThreadBrowser.DocumentCompleted += WriteToTextBoxEvent;
    
                newThreadBrowser.ScriptErrorsSuppressed = true;
                newThreadBrowser.Navigate(GetHomoePageUrl());
    
                Application.Run();
            });
            th.SetApartmentState(ApartmentState.STA);
            th.Start();
    
            #endregion
    
            // Form1
            this.Text = "B2B Crawler";
            this.Size = new Size(950, 950);
    
        }
    
        List<string> visitedUrls = new List<string>();
        List<string> visitedProducts = new List<string>();
    
        private void ExerciseApp(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            var wbReceived = sender as WebBrowser;
            int catalogElementIterationCounter = 0;
            var elementsToConsider = wbReceived.Document.All;
            string productUrl = String.Empty;
            bool isClicked = false;
    
            foreach (HtmlElement e1 in elementsToConsider)
            {
                catalogElementIterationCounter++;
                string x = e1.TagName;
                String idStr = e1.GetAttribute("id");
                if (!String.IsNullOrWhiteSpace(idStr))
                {
                    //Each Product Navigation
                    if (idStr.Contains("catalogEntry_img"))
                    {
                        productUrl = e1.GetAttribute("href");
                        if (!visitedProducts.Contains(productUrl))
                        {
                            WriteLogFunction("productUrl -- " + productUrl);
                            visitedProducts.Add(productUrl);
                            isClicked = true;
    
                            e1.InvokeMember("Click");
                            //nextNavigationUrl = productUrl;
    
                            break;
                        }
    
                    }
                }
            }
    
    
            if (visitedProducts.Count == 4)
            {
                visitedProducts = new List<string>();
                isClicked = true;
                HomoePageNavigate(wbReceived);
            }
    
            if (!isClicked)
            {
                HomoePageNavigate(wbReceived);
            }
        }
    
        void browser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            ExerciseApp(sender, e);
        }
    
    
        private string GetHomoePageUrl()
        {
            return @"C:\Samples_L\MyTableTest.html";
        }
    
        private void HomoePageNavigate(WebBrowser bw)
        {
            WriteLogFunction("HomoePageNavigate");
            bw.Navigate(GetHomoePageUrl());
        }
    
        private void WriteLogFunction(string strMessage)
        {
            using (StreamWriter w = File.AppendText("log.txt"))
            {
                w.WriteLine("\r\n{0} ..... {1} ", DateTime.Now.ToLongTimeString(), strMessage);
            }
        }
    
     }
    

    MyTableTest.html

    <html>
    <head>
    
        <style type="text/css">
            table {
                border: 2px solid blue;
            }
    
            td {
                border: 1px solid teal;
            }
        </style>
    
    </head>
    <body>
    
        <table id="four-grid">
             <tr>
                <td>
                    <a href="https://www.wikipedia.org/" id="catalogEntry_img63666">
    
                        <img src="ssss"
                            alt="B" width="70" />
                    </a>
                </td>
                <td>
                    <a href="http://www.keralatourism.org/" id="catalogEntry_img63667">
    
                        <img src="ssss"
                            alt="A" width="70" />
                    </a>
                </td>
            </tr>
            <tr>
                <td>
                    <a href="https://stackoverflow.com/users/696627/lijo" id="catalogEntry_img63664">
    
                        <img src="ssss"
                            alt="G" width="70" />
                    </a>
                </td>
                <td>
                    <a href="http://msdn.microsoft.com/en-US/#fbid=zgGLygxrE84" id="catalogEntry_img63665">
    
                        <img src="ssss"
                            alt="Y" width="70" />
                    </a>
                </td>
            </tr>
    
        </table>
    </body>
    
    </html>
    

1 Answer:

Answer 0 (score: 1)

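First, note that WebBrowser.DocumentText is a static snapshot: it keeps the originally loaded content and does not reflect any DOM/AJAX changes. To get the actual current HTML, do this on the background thread: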

    var html = hiddenWebBrowser.Document.GetElementsByTagName("html")[0].OuterHtml;

Then you can update the other WebBrowser instance on the UI thread:

    mainForm.BeginInvoke(new Action(() => mainForm.webBrowser.DocumentText = html));

Note that BeginInvoke is asynchronous, and so is the DocumentText assignment. Once the HTML has loaded, the DocumentCompleted event fires on mainForm.webBrowser.
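
Putting the two steps together, a minimal sketch might look like the following. It is only an illustration under stated assumptions, not a definitive implementation: the names MainForm, StartBackgroundBrowser and OnBackgroundDocumentCompleted are hypothetical, the URL is one of the links from the question's test page, and the worker is started from the form's Shown event on the assumption that BeginInvoke needs the form's window handle to exist. The key point is that the HTML is read on the thread that owns the hidden browser, and only a plain string crosses over to the UI thread.

```csharp
using System;
using System.Threading;
using System.Windows.Forms;

// Hypothetical form used only for this sketch.
public class MainForm : Form
{
    private readonly WebBrowser formWebBrowser = new WebBrowser { Dock = DockStyle.Fill };

    public MainForm()
    {
        Controls.Add(formWebBrowser);

        // Setting DocumentText is asynchronous; this fires when the copy has loaded.
        formWebBrowser.DocumentCompleted += (s, e) => Text = "Copy loaded";

        // Start the worker once the form is shown, so its window handle exists
        // by the time the worker calls BeginInvoke.
        Shown += (s, e) => StartBackgroundBrowser("https://www.wikipedia.org/");
    }

    private void StartBackgroundBrowser(string url)
    {
        var worker = new Thread(() =>
        {
            var newThreadBrowser = new WebBrowser { ScriptErrorsSuppressed = true };
            newThreadBrowser.DocumentCompleted += OnBackgroundDocumentCompleted;
            newThreadBrowser.Navigate(url);
            Application.Run(); // message loop for the hidden browser
        });
        worker.SetApartmentState(ApartmentState.STA); // WebBrowser requires an STA thread
        worker.IsBackground = true;                   // do not keep the process alive
        worker.Start();
    }

    // Raised on the background STA thread that owns newThreadBrowser.
    private void OnBackgroundDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        var newThreadBrowser = (WebBrowser)sender;
        if (newThreadBrowser.Document == null) return;

        // Read the current DOM here, on the owning thread.
        var roots = newThreadBrowser.Document.GetElementsByTagName("html");
        if (roots.Count == 0) return;
        string html = roots[0].OuterHtml;

        // Best-effort guard; the form could still close between this check and the call.
        if (IsDisposed || !IsHandleCreated) return;

        // Only the string is marshalled; the assignment runs on the UI thread.
        BeginInvoke(new Action(() => formWebBrowser.DocumentText = html));
    }

    [STAThread]
    public static void Main()
    {
        Application.EnableVisualStyles();
        Application.Run(new MainForm());
    }
}
```

Compared with the code in the question, the handler here never touches newThreadBrowser from the UI thread, which is the access pattern the quoted answer identifies as the problem. DocumentCompleted can fire more than once per page (for example, once per frame), so the form may be updated several times; since the application is not performance-critical, that is probably acceptable.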