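I have a list of URLs in a text file. I want to visit each of them with the C# WebBrowser class and save each site's content somewhere. The problem is that the program does not always navigate to the new URL.
Links 1 and 2 are visited correctly, then the browser window does not refresh on link 3. Link 4 works again, while 5, 6 and 7 fail. Link 8 works, 9 to 15 fail, 16 works, and so on.
Here is a sample of the URL list: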
http://www.example.com/somefile_7.html*SomeOtherText1*SomeAdditionalText1
http://www.example.com/somefile_12.html*SomeOtherText1*SomeAdditionalText2

    static int counter_getURL = 0;

    private void Form1_Load(object sender, EventArgs e)
    {
        nextTurn();
    }

    void startBrowser(string url)
    {
        webBrowser1.Navigate(new Uri(url), "_self");
        webBrowser1.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(get_browser_string);
    }

    void get_browser_string(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // Display the content of the website in textBox1
        textBox1.Text = webBrowser1.Document.Body.InnerText;
        MessageBox.Show("Next");
        nextTurn();
    }

    public void nextTurn()
    {
        startBrowser(getURL());
    }

    public string getURL()
    {
        string url = "";
        string[] input = System.IO.File.ReadAllLines(@"C:\Users\WORKSTATION01\Desktop\url_list.txt", Encoding.Default);
        // Get the URL only
        string[] splitted = input[counter_getURL].Split(new char[] { '*' });
        url = splitted[0];
        counter_getURL++;
        return url;
    }

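Answer 0 (score: 1)
DocumentCompleted also fires for FRAMEs inside a web page. My guess is that some of the pages in your URL list contain FRAMEs, and that is interfering with your code.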
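
A minimal sketch of how the handler could skip frame completions, assuming a plain WinForms project. The class name `UrlCrawlerForm`, the helper `UrlList.ExtractUrl`, and the control layout are hypothetical; the other names mirror the question. Comparing `e.Url` with `webBrowser1.Url` is a common heuristic for detecting the top-level document, not a guaranteed test. The sketch also subscribes to `DocumentCompleted` only once: the posted `startBrowser` adds a handler on every call, which by itself makes each completed page trigger more than one `nextTurn`.

```csharp
using System;
using System.IO;
using System.Text;
using System.Windows.Forms;

// Hypothetical helper: keeps only the part of a list line before the first '*'.
internal static class UrlList
{
    public static string ExtractUrl(string line)
    {
        return line.Split('*')[0].Trim();
    }
}

// Hypothetical form; field and method names follow the question's code.
public class UrlCrawlerForm : Form
{
    // Path taken from the question; adjust for your machine.
    private const string ListPath = @"C:\Users\WORKSTATION01\Desktop\url_list.txt";

    private readonly WebBrowser webBrowser1 = new WebBrowser { Dock = DockStyle.Fill };
    private readonly TextBox textBox1 =
        new TextBox { Dock = DockStyle.Bottom, Multiline = true, Height = 150 };

    private string[] lines = new string[0];
    private int counter_getURL = 0;

    public UrlCrawlerForm()
    {
        Controls.Add(webBrowser1);
        Controls.Add(textBox1);

        // Subscribe exactly once, not on every Navigate call.
        webBrowser1.DocumentCompleted += get_browser_string;
        Load += Form1_Load;
    }

    private void Form1_Load(object sender, EventArgs e)
    {
        // Read the list once instead of on every getURL call.
        lines = File.ReadAllLines(ListPath, Encoding.Default);
        nextTurn();
    }

    private void get_browser_string(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // Frames raise DocumentCompleted too; ignore events whose URL
        // is not the URL of the top-level document.
        if (!e.Url.Equals(webBrowser1.Url))
        {
            return;
        }

        // Body can be null for documents without a <body> element.
        if (webBrowser1.Document != null && webBrowser1.Document.Body != null)
        {
            textBox1.Text = webBrowser1.Document.Body.InnerText;
        }

        nextTurn();
    }

    private void nextTurn()
    {
        // Stop when the list is exhausted instead of indexing past the end.
        if (counter_getURL >= lines.Length)
        {
            return;
        }

        string url = UrlList.ExtractUrl(lines[counter_getURL]);
        counter_getURL++;
        webBrowser1.Navigate(new Uri(url));
    }
}
```

To try it, start the form from a `[STAThread]` entry point with `Application.Run(new UrlCrawlerForm());`. If some pages still advance too early or not at all, logging `e.Url` inside the handler should show which completions come from frames.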