using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Net;
using HtmlAgilityPack;
namespace sss
{
    public class Downloader
    {
        WebClient client = new WebClient();

        public HtmlDocument FindMovie(string Title)
        {
            // To be implemented later: will search for a movie by title.
        }

        public HtmlDocument FindKnownMovie(string ID)
        {
            HtmlDocument Page = (HtmlDocument)client.DownloadString(String.Format("http://www.imdb.com/title/{0}/", ID));
        }
    }
}
How can I convert the downloaded string into a valid HtmlDocument so that I can parse it with HtmlAgilityPack?
Answer 0 (score: 6)
This works with v1.4:
HtmlWeb hw = new HtmlWeb();
HtmlDocument doc = hw.Load(string.Format("http://www.imdb.com/title/{0}/", ID));
or
string html = client.DownloadString(String.Format("http://www.imdb.com/title/{0}/", ID));
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);
Answer 1 (score: 4)
Try this (based on this fairly old document):
string url = String.Format("http://www.imdb.com/title/{0}/", ID);
string content = client.DownloadString(url);
HtmlDocument page = new HtmlDocument();
page.LoadHtml(content);
Basically, casting is rarely the right way to convert between two types, particularly when parsing is involved.
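Applied to the class from the question, the approach could be sketched as follows. This is a minimal sketch, assuming the HtmlAgilityPack package is referenced; it only swaps the cast for `LoadHtml` and adds the missing `return`, and it omits the unimplemented `FindMovie` method.

```csharp
using System;
using System.Net;
using HtmlAgilityPack;

namespace sss
{
    public class Downloader
    {
        private readonly WebClient client = new WebClient();

        public HtmlDocument FindKnownMovie(string ID)
        {
            // Download the raw markup as a plain string.
            string html = client.DownloadString(
                String.Format("http://www.imdb.com/title/{0}/", ID));

            // Parse the string into a document; a cast cannot do this.
            HtmlDocument page = new HtmlDocument();
            page.LoadHtml(html);
            return page;
        }
    }
}
```

The returned document's `DocumentNode` property can then be used as the starting point for queries.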
Answer 2 (score: 1)
The following lines of code will create an HtmlDocument from your content:
// First create a blank document
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
// Then load it with the content from the webpage you are trying to parse
doc.Load(new StreamReader(WebRequest.Create("yourURL").GetResponse()
.GetResponseStream()));
Answer 3 (score: 0)
Maybe you could create a new file (.html) on the file system and use a stream writer to write the string into it, then pass that file to the parser.
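A rough sketch of that file-based route, for illustration only. The class and method names (`FileRoundTrip`, `LoadViaTempFile`) are hypothetical, and the temporary file path is an assumption; `File.WriteAllText` stands in for an explicit stream writer.

```csharp
using System.IO;
using HtmlAgilityPack;

public static class FileRoundTrip
{
    // Hypothetical helper: writes the HTML to a temporary file,
    // then lets HtmlAgilityPack load the document from that file.
    public static HtmlDocument LoadViaTempFile(string html)
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, html);

            HtmlDocument doc = new HtmlDocument();
            doc.Load(path);
            return doc;
        }
        finally
        {
            // Remove the temporary file once the document is in memory.
            File.Delete(path);
        }
    }
}
```

Since `LoadHtml` accepts the string directly, this detour through the disk is probably only worthwhile when a copy of the page on disk is wanted anyway.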