WebClient webClient = new WebClient();
string page = webClient.DownloadString(
    "http://www.deu.edu.tr/DEUWeb/Guncel/v2_index_cron.html");
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(page);
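I want to parse the page given above and get the row data from its table. I have tried several examples, but I could not get any of them to work. Any suggestions?
Answer 0 (score: 2)
For example, you can parse the rows like this: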
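using System.Net;
using HtmlAgilityPack;

namespace ConsoleApplication5
{
    class Program
    {
        static void Main(string[] args)
        {
            // Download the raw HTML of the page.
            WebClient webClient = new WebClient();
            string page = webClient.DownloadString("http://www.deu.edu.tr/DEUWeb/Guncel/v2_index_cron.html");

            // Parse the HTML with HtmlAgilityPack.
            HtmlDocument doc = new HtmlDocument();
            doc.LoadHtml(page);

            // Take the first <table> in the document and visit every cell of its rows.
            // Note: SelectSingleNode and SelectNodes return null by default when nothing matches.
            HtmlNode table = doc.DocumentNode.SelectSingleNode("//table");
            foreach (var cell in table.SelectNodes("tr/td"))
            {
                string someVariable = cell.InnerText;
            }
        }
    }
}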
For completeness, LINQ makes it easy to build an enumeration of all the non-empty row values:
// Requires: using System; using System.Linq; using System.Net; using HtmlAgilityPack;
private static void Main(string[] args)
{
    WebClient webClient = new WebClient();
    string page = webClient.DownloadString("http://www.deu.edu.tr/DEUWeb/Guncel/v2_index_cron.html");
    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(page);
    HtmlNode table = doc.DocumentNode.SelectSingleNode("//table");

    // Project each cell to its text and drop the empty or whitespace-only values.
    var rows = table.SelectNodes("tr/td")
        .Select(cell => cell.InnerText)
        .Where(someVariable => !String.IsNullOrWhiteSpace(someVariable))
        .ToList();
}
Answer 1 (score: 1)
Here is an example that enumerates all table cells and writes the inner text of each one to the console:
// Requires: using System; using System.Net;
WebClient webClient = new WebClient();
var page = webClient.DownloadString("http://www.deu.edu.tr/DEUWeb/Guncel/v2_index_cron.html");
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(page);

// Select every cell of every table row and print its text.
foreach (var td in doc.DocumentNode.SelectNodes("//table/tr/td"))
{
    Console.WriteLine(td.InnerText);
}
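The loop above yields one flat sequence of cells, while the question asks for row information. A minimal sketch of keeping the cells grouped by row might look like the following. It assumes the HtmlAgilityPack package is referenced; the names `RowsByRowSketch`, `ReadRows`, and `sampleHtml` are hypothetical, and the inline sample markup stands in for the downloaded page. It also uses `//table//tr` rather than `//table/tr`, on the assumption that the rows may sit inside a `<tbody>`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

class RowsByRowSketch
{
    // Returns one list of cell texts per <tr>; the result is empty if no rows are found.
    static List<List<string>> ReadRows(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var rows = new List<List<string>>();

        // By default SelectNodes returns null (not an empty collection) when nothing matches.
        var trNodes = doc.DocumentNode.SelectNodes("//table//tr");
        if (trNodes == null)
        {
            return rows;
        }

        foreach (HtmlNode tr in trNodes)
        {
            // "td|th" is evaluated relative to the current row, so cells stay grouped per row.
            var cells = tr.SelectNodes("td|th");
            if (cells == null)
            {
                continue; // skip rows without cells
            }

            // Decode HTML entities such as &amp; and trim surrounding whitespace.
            rows.Add(cells.Select(c => HtmlEntity.DeEntitize(c.InnerText).Trim()).ToList());
        }

        return rows;
    }

    static void Main()
    {
        // Hypothetical sample markup, used here instead of the downloaded page.
        string sampleHtml =
            "<table><tr><th>Date</th><th>Title</th></tr>" +
            "<tr><td>01.01</td><td>First &amp; second</td></tr></table>";

        foreach (List<string> row in ReadRows(sampleHtml))
        {
            Console.WriteLine(string.Join(" | ", row));
        }
    }
}
```

To run it against the real page, the string returned by `webClient.DownloadString(...)` could be passed to `ReadRows` in place of `sampleHtml`. Note that `//table//tr` also picks up rows of nested tables, so a narrower XPath may be needed if the page nests tables for layout.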