Question

Here is what I have done so far to scrape http://stock.hankyung.com/news/app/newslist.php?cid=01.

I have attached my file, since I was not sure how to post the code on this site.

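As you can see, this only scrapes the first page of news. What I want to do now is scrape the news from page 1 through the last page (43199), but I don't know where to start. I am very much a beginner: I started using C# a few weeks ago, but I am very interested in coding and would like to learn from C# experts. Sorry for my poor English.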

Answer 1

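The current page in the pager is a span rather than a link, so this is what I would do. First find the pager at the bottom of the page, then click the link that follows the current-page span: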

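    // Requires: using System.Collections.Generic; using System.Linq; using OpenQA.Selenium;
    // Assumes `driver` is an IWebDriver field that already has the news list page open.
    public bool GoToNextPage()
    {
        // Locate the pager at the bottom of the page
        IWebElement pager = driver.FindElement(By.XPath("descendant::div[@class='paging']"));

        // The current page is a <span>; the first link after it leads to the next page
        List<IWebElement> nextPage = pager.FindElements(By.XPath("descendant::span/following-sibling::a")).ToList();
        if (nextPage.Count > 0)
        {
            nextPage.First().Click();
            return true;
        }
        else
        {
            // You're at the end of the pages
            return false;
        }
    }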

Then I would structure the program like this:

    bool hasNextPage = true;
    while (hasNextPage)
    {
        // Your existing scrape functionality here
        hasNextPage = GoToNextPage();
    }
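With 43199 pages to walk through, it may help to cap the number of pages while testing and to collect the results in one place. The following is only a sketch of the same loop under those assumptions: `PagedCrawler`, `CrawlAllPages`, and `maxPages` are hypothetical names that do not appear in the answer above, and the scrape step is assumed to return the current page's items as strings.

```csharp
using System;
using System.Collections.Generic;

public static class PagedCrawler
{
    // Hypothetical helper: scrapes the current page, then calls goToNextPage
    // until it returns false or maxPages pages have been visited.
    public static List<string> CrawlAllPages(
        Func<IEnumerable<string>> scrapePage,
        Func<bool> goToNextPage,
        int maxPages)
    {
        var results = new List<string>();
        int visited = 0;
        bool hasNextPage = true;

        while (hasNextPage && visited < maxPages)
        {
            // Collect whatever the scrape step returns for the current page
            results.AddRange(scrapePage());
            visited++;

            // Only click through if another page is still going to be scraped
            hasNextPage = visited < maxPages && goToNextPage();
        }

        return results;
    }
}
```

It could be called as `PagedCrawler.CrawlAllPages(ScrapeCurrentPage, GoToNextPage, 50)`, where `ScrapeCurrentPage` stands in for your existing scrape code (a hypothetical method returning the headlines on the current page) and `GoToNextPage` is the method from the answer. Passing the two steps in as delegates keeps the loop independent of Selenium, so it can be tried out with fake pages before running it against the real site.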
