Question

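I'm trying to get a list of all the names from here (the titles of the champion links in the table), but I haven't had any luck. Can anyone tell me what's wrong with this code?

Thanks!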

var url = "http://leagueoflegends.wikia.com/wiki/List_of_champions";
var web = new HtmlWeb();
var doc = web.Load(url);            

foreach (HtmlNode table in doc.DocumentNode.SelectNodes("//table[3]/tr"))
{
    HtmlNode item = table.SelectSingleNode("//a");
    Console.WriteLine(item.GetAttributeValue("title", false));
}

Update

OK, I got it working with this code:

var url = "http://leagueoflegends.wikia.com/wiki/List_of_champions";
var web = new HtmlWeb();
var doc = web.Load(url);            

foreach (HtmlNode item in doc.DocumentNode.SelectNodes("//table[3]/tr/td/span/a"))
{
    Console.WriteLine(item.Attributes["title"].Value);
}

return true;

Thanks for your help!

Answer 1

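Use the XPath like this:

foreach (HtmlNode linkItem in doc.DocumentNode.SelectNodes("//table[3]/tr//a"))
{
    // Value is a property, not a method, so Value() does not compile.
    // GetAttributeValue returns the fallback ("") instead of throwing
    // when an attribute such as "alt" is missing from the link.
    Console.WriteLine(linkItem.GetAttributeValue("title", ""));
    Console.WriteLine(linkItem.GetAttributeValue("alt", ""));
}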
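
To illustrate why the loop in the question kept printing the wrong thing, here is a minimal sketch. In XPath, an expression starting with `//` is evaluated from the document root even when it is called on a row node, while `.//` is evaluated relative to that node. The `SampleHtml` markup, the `RelativeXPathDemo` class and the `TitlesPerRow` helper are made up for illustration; the real page's structure may differ. Note also that `GetAttributeValue("title", false)` picks the `bool` overload, so it prints a boolean rather than the title; passing a string default such as `""` returns the text.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class RelativeXPathDemo
{
    // Hypothetical markup standing in for the real page.
    public const string SampleHtml = @"<table>
  <tr><td><span><a href='/wiki/Ahri' title='Ahri'>Ahri</a></span></td></tr>
  <tr><td><span><a href='/wiki/Akali' title='Akali'>Akali</a></span></td></tr>
</table>";

    // Returns the title of the first link found for each table row,
    // using the given XPath to locate the link from the row node.
    public static List<string> TitlesPerRow(string html, string linkXPath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var titles = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection rows = doc.DocumentNode.SelectNodes("//table/tr");
        if (rows == null)
        {
            return titles;
        }

        foreach (HtmlNode row in rows)
        {
            HtmlNode link = row.SelectSingleNode(linkXPath);
            if (link != null)
            {
                // The string overload returns the attribute text, or "" if it is absent.
                titles.Add(link.GetAttributeValue("title", ""));
            }
        }

        return titles;
    }

    public static void Main()
    {
        // "//a" restarts at the document root for every row.
        Console.WriteLine(string.Join(", ", TitlesPerRow(SampleHtml, "//a")));
        // ".//a" stays inside the current row.
        Console.WriteLine(string.Join(", ", TitlesPerRow(SampleHtml, ".//a")));
    }
}
```

With the rooted expression every row resolves to the same first link in the document; with the relative one each row yields its own link.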

Answer 2

I knocked together a quick and dirty example. It's tested and works flawlessly; you'll want to format the results a bit:

protected void Page_Load(object sender, EventArgs e)
{
    List<HtmlAgilityPack.HtmlNode> test = GetInnerTest();

    foreach (var node in test)
    {
        // InnerHtml is already a string, so no ToString() call is needed.
        Response.Write("Result: " + node.InnerHtml);
    }
}

public List<HtmlAgilityPack.HtmlNode> GetInnerTest()
{
    HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();

    doc.OptionFixNestedTags = true;

    // Dispose the reader (and its underlying response stream) once the document is parsed.
    using (StreamReader reader = requestData("http://leagueoflegends.wikia.com/wiki/List_of_champions"))
    {
        doc.Load(reader);
    }

    // Collect every <span> whose class attribute contains "character_icon".
    var node = doc.DocumentNode.Descendants("span").Where(d => d.Attributes.Contains("class") && d.Attributes["class"].Value.Contains("character_icon")).ToList();

    return node;
}

public StreamReader requestData(string url)
{
    HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
    HttpWebResponse resp = (HttpWebResponse)req.GetResponse();

    // The caller is responsible for disposing the returned reader.
    StreamReader sr = new StreamReader(resp.GetResponseStream());

    return sr;
}

You will need to download HtmlAgilityPack and add the relevant references to your project.
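
The example above returns the matching `span` nodes and writes their inner HTML. To get just the link titles the question asks for, one possible follow-up is to walk from each span to its links and read the `title` attribute. This is only a sketch: the `SampleHtml` markup, the `ChampionTitles` class and the `GetLinkTitles` helper are hypothetical, and it assumes each `character_icon` span contains the champion link.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class ChampionTitles
{
    // Hypothetical markup; the real page may nest these elements differently.
    public const string SampleHtml = @"<table>
  <tr><td><span class='character_icon'><a href='/wiki/Ahri' title='Ahri'>Ahri</a></span></td></tr>
  <tr><td><span class='other'><a href='/wiki/Help' title='Help'>Help</a></span></td></tr>
  <tr><td><span class='character_icon'><a href='/wiki/Akali' title='Akali'>Akali</a></span></td></tr>
</table>";

    // Returns the non-empty title of every link inside a span whose
    // class attribute contains "character_icon".
    public static List<string> GetLinkTitles(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return doc.DocumentNode
            .Descendants("span")
            // GetAttributeValue falls back to "" when the span has no class attribute.
            .Where(s => s.GetAttributeValue("class", "").Contains("character_icon"))
            .SelectMany(s => s.Descendants("a"))
            .Select(a => a.GetAttributeValue("title", ""))
            .Where(t => t.Length > 0)
            .ToList();
    }

    public static void Main()
    {
        foreach (string title in GetLinkTitles(SampleHtml))
        {
            Console.WriteLine(title);
        }
    }
}
```

Filtering on the class name rather than on `//table[3]` is a design choice: it should keep working if the table's position on the page changes, but it will break if the class is renamed.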

How can I get all the link titles?
