Question

I would like to know how to get data from a web page.

Example:

<li id="hello1">about me
    <ul class="square">
        <li><strong>name: john</strong></li>
    </ul>
</li>

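I want to read the "john" that comes after "name:". How can I read it in C#? I tried using HTML Agility Pack :( but its documentation is so poor that I could not get it working, so I need help.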

Answer 1

Use HtmlAgilityPack:

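using System.Text.RegularExpressions;
using HtmlAgilityPack;

HtmlDocument doc = new HtmlDocument();
doc.Load(yourStream); // yourStream: a Stream holding the page's HTML
// text contains roughly `about me name: john`, with the source whitespace in between
string text = doc.DocumentNode.SelectSingleNode("//li[@id='hello1']").InnerText;
// .NET supports variable-length lookbehind, so this matches the word after "name:"
string name = Regex.Match(text, @"(?<=name:\s*)\w+").Value; // john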
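
The lookbehind pattern only needs a string, so it can be tried without HtmlAgilityPack. Below is a minimal sketch, assuming the text handed in looks like the InnerText of the li with its source whitespace preserved. NameExtractor and TryExtractName are hypothetical names introduced for illustration, not part of any library.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NameExtractor
{
    // Hypothetical helper: finds the first word that follows "name:".
    // Returns false (and an empty name) when the label is not present.
    public static bool TryExtractName(string innerText, out string name)
    {
        // (?<=name:\s*) is a lookbehind: it requires "name:" plus optional
        // whitespace before the match without including it in the result.
        Match m = Regex.Match(innerText, @"(?<=name:\s*)\w+");
        name = m.Success ? m.Value : string.Empty;
        return m.Success;
    }
}
```

Calling NameExtractor.TryExtractName("about me\n    name: john\n", out string name) returns true and sets name to "john". Checking the return value avoids silently treating an empty string as a real name when the page layout changes.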

Answer 2

I have used HTML Agility Pack before, and it is a great tool:

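HtmlDocument document = new HtmlDocument();
document.LoadHtml(YourHTML); // YourHTML: the page source as a string
// SelectNodes returns null (not an empty collection) when nothing matches
var collection = document.DocumentNode.SelectNodes("//li[@id='hello1']");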
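
Selecting the li elements is only the first half; the name still has to be pulled out of them. Here is one possible way to finish, sketched under the assumption that the name always sits in a strong element inside the li and is written as "name: value". ProfileReader and ReadName are hypothetical names used for illustration.

```csharp
using System;
using HtmlAgilityPack;

public static class ProfileReader
{
    // Hypothetical helper: returns the text after "name:" inside
    // <li id="hello1">, or an empty string when the element or label is missing.
    public static string ReadName(string html)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        // SelectSingleNode returns null when the XPath matches nothing.
        HtmlNode strong = document.DocumentNode
            .SelectSingleNode("//li[@id='hello1']//strong");
        if (strong == null) return string.Empty;

        // Decode entities such as &amp; before splitting on the first colon.
        string text = HtmlEntity.DeEntitize(strong.InnerText);
        int colon = text.IndexOf(':');
        return colon < 0 ? string.Empty : text.Substring(colon + 1).Trim();
    }
}
```

Targeting the strong element directly avoids running a regular expression over the whole li text. To read from a live page instead of a string, HtmlAgilityPack's HtmlWeb class can fetch and parse in one step with new HtmlWeb().Load(url).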

How to find and extract text from a web page in C#
