I know why my exception is being thrown, but I can't find a better way around it. I've tried everything so far, but I can't get the results I want without changing the approach. The method Scrape(string link, Regex expression, WebClient webClient) returns a list of strings. This code works fine without multithreading, but crawling on a single thread is very slow. My goal is to run at least 15 threads. (I have also tried increasing the stack size.)
private void Crawl(List<String> links)
{
    List<String> scrapedLinks = new List<String>();
    foreach (string link in links)
    {
        List<String> scrapedItems = Scrape(link, new Regex(iTalk_TextBox_Small2.Text), new WebClient());
        foreach (string item in scrapedItems) listBox1.Invoke(new Action(delegate () { listBox1.Items.Add(item); }));
        iTalk_Label4.Invoke(new Action(delegate () { iTalk_Label4.Text = "Scraped Items: " + listBox1.Items.Count; }));
        if (scrapedItems.Count > 0 || !Properties.Settings.Default.Inspector)
        {
            foreach (string scrapedLink in Scrape(link, new Regex(@"https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)"), new WebClient()))
            {
                if (!Properties.Settings.Default.Blacklist.Contains(scrapedLink)) scrapedLinks.Add(scrapedLink);
            }
            scrapedLinksTotal += scrapedLinks.Count;
        }
        iTalk_Label5.Invoke(new Action(delegate () { iTalk_Label5.Text = "Scraped Links: " + scrapedLinksTotal; }));
    }
    Crawl(scrapedLinks);
}
Answer 0 (score: 2)
In 99% of cases, a stack overflow is caused by unbounded recursion. In your case, Crawl calls Crawl(scrapedLinks) unconditionally at the end of every invocation, so the recursion never terminates. I don't know what scrapedLinks is supposed to contain, but that call is the cause.
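One way to apply this, sketched against the Scrape method and controls from the question (the label updates are omitted for brevity): replace the recursive call with an explicit work queue, so each newly found link is enqueued instead of opening another stack frame, and track visited links so the crawl eventually finishes.

private void Crawl(List<String> links)
{
    var queue = new Queue<string>(links);
    var seen = new HashSet<string>(links);   // skip URLs that were already crawled
    // URL pattern copied from the question, compiled once instead of per link.
    var urlRegex = new Regex(@"https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)");

    while (queue.Count > 0)
    {
        string link = queue.Dequeue();
        List<String> scrapedItems = Scrape(link, new Regex(iTalk_TextBox_Small2.Text), new WebClient());
        foreach (string item in scrapedItems)
            listBox1.Invoke(new Action(delegate () { listBox1.Items.Add(item); }));

        if (scrapedItems.Count > 0 || !Properties.Settings.Default.Inspector)
        {
            foreach (string scrapedLink in Scrape(link, urlRegex, new WebClient()))
            {
                if (!Properties.Settings.Default.Blacklist.Contains(scrapedLink) && seen.Add(scrapedLink))
                    queue.Enqueue(scrapedLink);   // enqueue instead of recursing
            }
        }
    }
}

This keeps the call stack flat no matter how many links are discovered, which is what removes the StackOverflowException.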
Answer 1 (score: 1)
Add a terminating condition. Without digging deeper into the logic of what Crawl actually does, perhaps something as simple as this will solve the problem:
private void Crawl(List<String> links)
{
    //////////////////////////////////
    // Check for something to work on
    if (links == null || links.Count == 0)
        return; // Return if there is nothing to do.
    //////////////////////////////////
    List<String> scrapedLinks = new List<String>();
    foreach (string link in links)
    {
        List<String> scrapedItems = Scrape(link, new Regex(iTalk_TextBox_Small2.Text), new WebClient());
        foreach (string item in scrapedItems) listBox1.Invoke(new Action(delegate () { listBox1.Items.Add(item); }));
        iTalk_Label4.Invoke(new Action(delegate () { iTalk_Label4.Text = "Scraped Items: " + listBox1.Items.Count; }));
        if (scrapedItems.Count > 0 || !Properties.Settings.Default.Inspector)
        {
            foreach (string scrapedLink in Scrape(link, new Regex(@"https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)"), new WebClient()))
            {
                if(!Properties.Settings.Default.Blacklist.Contains(scrapedLink)) scrapedLinks.Add(scrapedLink);
            }
            scrapedLinksTotal += scrapedLinks.Count;
        }
        iTalk_Label5.Invoke(new Action(delegate () { iTalk_Label5.Text = "Scraped Links: " + scrapedLinksTotal; }));
    }
    Crawl(scrapedLinks);
}
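The question also asks for at least 15 threads. A rough sketch of one way to do that, assuming the same Scrape method, controls and settings keys from the question: process each level of discovered links with Parallel.ForEach capped at 15 workers, give every worker its own WebClient, and keep marshalling UI updates through Invoke as the original code already does. The method name CrawlParallel is hypothetical, the label updates are again omitted for brevity, and it needs a using System.Threading.Tasks; directive.

// Call this from a background thread, as the original Crawl apparently is (hence the Invoke calls).
private void CrawlParallel(List<String> links)
{
    var current = links ?? new List<String>();
    var seen = new System.Collections.Concurrent.ConcurrentDictionary<string, bool>();

    // Read the pattern on the calling thread once, and compile the URL regex once.
    var itemRegex = new Regex(iTalk_TextBox_Small2.Text);
    var urlRegex = new Regex(@"https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)");
    var options = new ParallelOptions { MaxDegreeOfParallelism = 15 };

    while (current.Count > 0)
    {
        var nextLevel = new System.Collections.Concurrent.ConcurrentBag<string>();

        Parallel.ForEach(current, options, link =>
        {
            // WebClient is not safe to share between threads, so each worker creates its own.
            using (var webClient = new WebClient())
            {
                List<String> scrapedItems = Scrape(link, itemRegex, webClient);
                foreach (string item in scrapedItems)
                    listBox1.Invoke(new Action(delegate () { listBox1.Items.Add(item); }));

                if (scrapedItems.Count > 0 || !Properties.Settings.Default.Inspector)
                {
                    foreach (string scrapedLink in Scrape(link, urlRegex, webClient))
                    {
                        // Skip blacklisted links and links already queued in any earlier level.
                        if (!Properties.Settings.Default.Blacklist.Contains(scrapedLink) && seen.TryAdd(scrapedLink, true))
                            nextLevel.Add(scrapedLink);
                    }
                }
            }
        });

        current = new List<String>(nextLevel);   // next breadth-first level; the loop ends when nothing new is found
    }
}

Because the loop is level-by-level rather than recursive, it avoids the stack overflow for the same reason as the queue version above, while the ParallelOptions cap keeps the worker count at the 15 threads the question aims for.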