Question

How can I capture files that the WebBrowser control downloads automatically (for example .html, .png, and so on)?

For example, if a website uses JavaScript to download an HTML file every 30 seconds, how can I capture that HTML with the WebBrowser control?

Answer 1

My approach would be:

Step 1: Grab the content of the script element (the one with the variable that holds the HTML file path)

// Uses the HtmlAgilityPack and Jurassic packages; also needs using System; and using System.Linq;
// html is the source of the page that contains the script.
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);

// Takes the first script tag (check the descendants to find the right one for your page)
var script = doc.DocumentNode.Descendants()
                             .Where(n => n.Name == "script")
                             .First().InnerText;

// Run the script, return the value of the spects variable, and stringify it into proper JSON
var engine = new Jurassic.ScriptEngine();
var result = engine.Evaluate("(function() { " + script + " return spects; })()");
var json = Jurassic.Library.JSONObject.Stringify(engine, result);

Console.WriteLine(json);
Console.ReadKey();

Step 2: Open the HTML page in another WebBrowser control, or in the same one

// jsVariableAsString is the HTML file URL extracted in step 1
WebBrowser wb2 = new WebBrowser();
wb2.AllowNavigation = true;
wb2.Navigate(jsVariableAsString);

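Step 3: Save the page loaded in the WebBrowser control

// DocumentText is already a string; read it after the DocumentCompleted event has fired
var html = wb2.DocumentText;

Or call wb2.ShowSaveAsDialog();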
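
Steps 2 and 3 only work once navigation has finished, so here is a rough sketch of how they might be wired together in a console program. The URL, the output file name, and the class name are placeholders I made up for illustration; the URL stands in for the value extracted in step 1. It assumes a Windows Forms reference and an STA thread, which the WebBrowser control requires.

```csharp
using System;
using System.IO;
using System.Windows.Forms;

static class SavePageSketch
{
    [STAThread] // the WebBrowser control needs a single-threaded apartment
    static void Main()
    {
        // Hypothetical values: substitute the URL from step 1 and a real output path.
        string url = "http://example.com/page.html";
        string outputPath = "captured.html";

        var wb2 = new WebBrowser { ScriptErrorsSuppressed = true };
        wb2.DocumentCompleted += (sender, e) =>
        {
            // DocumentCompleted also fires for frames; only handle the top-level document.
            if (e.Url != wb2.Url) return;

            // Write the loaded markup to disk, then stop the message loop.
            File.WriteAllText(outputPath, wb2.DocumentText);
            Application.ExitThread();
        };

        wb2.Navigate(url);
        Application.Run(); // message loop so the control can raise its events
    }
}
```

Note that DocumentText only covers HTML. For binary files such as .png you would probably have to fetch the URL yourself instead of reading it back from the control.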

Let me know if it works.
