Port BeautifulSoup到HtmlAgility

时间:2017-04-22 18:44:42

标签: c# html-agility-pack

我正在尝试将此程序移植到pythonC#

from __future__ import print_function

import requests
from bs4 import BeautifulSoup

r = requests.get('http://www.forexfactory.com/calendar.php?day=nov18.2016')
soup = BeautifulSoup(r.text, 'lxml')

tables = soup.findAll("table", {'class':'calendar__table'})

for table in tables:
    for row in table.findAll("tr"):
        for cell in row.findAll("td"):
            print (cell.text, end = " ")
        print()

这是C#使用HtmlAgilityPack的[代码片段]尝试,但它不起作用:

HtmlWeb browser = new HtmlWeb();
string URI = "http://www.forexfactory.com/calendar.php?day=nov18.2016";

ServicePointManager.ServerCertificateValidationCallback += (sender, cert, chain, sslPolicyErrors) => true;
ServicePointManager.SecurityProtocol = SecurityProtocolType.Ssl3 | SecurityProtocolType.Tls | SecurityProtocolType.Tls11 | SecurityProtocolType.Tls12;

HtmlDocument document = browser.Load(URI);

foreach (HtmlNode row in document.DocumentNode.Descendants("table").FirstOrDefault(_ => _.Id.Equals("calendar__table")).Descendants("tr"))
    Console.WriteLine(row);

1 个答案:

答案 0 :(得分:1)

您可以使用此代码

查询id和单个节点
document.DocumentNode.SelectSingleNode("//table[@id='calendar_table']").Descendants("tr");

但我想你需要按类查询,而不是按id查询,所以代码看起来像这样

document.DocumentNode.SelectSingleNode("//table[@class='calendar_table']").Descendants("tr");

在python代码中,类名称有两个__符号,但在c#代码中有一个 - _