I have some strings with content like this:
<a href="http://example.com/2014/06/22/new-idea-about-life.zip">One</a>
<a href="http://example.com/2014/06/22/new-idea-about-life-rar.rar">Two</a>
I need this output:
http://example.com/2014/06/22/new-idea-about-life.zip
http://example.com/2014/06/22/new-idea-about-life-rar.rar
Answer 0 (score: 0)
HTML Agility Pack is a good library for parsing HTML in C#.
An example that extracts the URLs:
using System.Collections.Generic;
using HtmlAgilityPack;

var html = "<a href=\"http://reallife.com/2014/06/22/new-idea-about-life.zip\">New idea about life (zip) (25MB)</a><a href=\"http://reallife.com/2014/06/22/new-idea-about-life-rar.rar\">New idea about life (rar) (23MB)</a>";
var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(html);
var links = new List<string>();
// SelectNodes returns null when nothing matches, so guard before iterating
var anchors = htmlDoc.DocumentNode.SelectNodes("//a[@href]");
if (anchors != null)
{
    foreach (var link in anchors)
    {
        links.Add(link.GetAttributeValue("href", string.Empty));
    }
}
// do something with the links inside the links list
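If you prefer LINQ to XPath, the same extraction can be sketched as a small helper that also prints the output asked for in the question. This is only a minimal sketch: the class name `LinkExtractor` and the method name `ExtractLinks` are made up for illustration, and it assumes the HtmlAgilityPack NuGet package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper class; the names are illustrative, not part of HtmlAgilityPack.
public static class LinkExtractor
{
    // Returns the href value of every <a> element that has a non-empty href.
    public static List<string> ExtractLinks(string html)
    {
        var htmlDoc = new HtmlDocument();
        htmlDoc.LoadHtml(html);

        // Descendants yields an empty sequence (never null) when no <a> elements exist,
        // so no null check is needed here.
        return htmlDoc.DocumentNode
            .Descendants("a")
            .Select(a => a.GetAttributeValue("href", string.Empty))
            .Where(href => !string.IsNullOrEmpty(href))
            .ToList();
    }

    public static void Main()
    {
        // Input taken from the question.
        var html =
            "<a href=\"http://example.com/2014/06/22/new-idea-about-life.zip\">One</a>" +
            "<a href=\"http://example.com/2014/06/22/new-idea-about-life-rar.rar\">Two</a>";

        // Write one URL per line.
        foreach (var link in ExtractLinks(html))
        {
            Console.WriteLine(link);
        }
    }
}
```

With the two anchors from the question as input, this should write each href on its own line. Anchors without an href are skipped by the `Where` filter.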