I want to get the text of an HTML page from a simple C# application. Suppose there are nested elements, e.g.:
<Table>
<TR>
<TD>**ABC**
</TD>
<TD>**1**
</TD>
</TR>
<TR>
<TD>**XYZ**
</TD>
<TD>**2**
</TD>
</TR>
</Table>
How can I get the (bold) text values directly? I want to save them in my database and also display them in a GridView.
HtmlDocument htmlSnippet = LoadHtmlSnippetFromFile();
private HtmlDocument LoadHtmlSnippetFromFile()
{
    //TextReader reader = File.OpenText(Server.MapPath("~/App_Data/HtmlSnippet.txt"));
    const string strUrl = "http://www.dsebd.org/latest_PE_all2_08.php";
    HtmlDocument doc = new HtmlDocument();
    // Dispose the client and the response stream even if loading throws.
    using (WebClient webClient = new WebClient())
    using (Stream reader = webClient.OpenRead(strUrl))
    {
        doc.Load(reader);
    }
    return doc;
}
How can I get the values out of htmlSnippet?
Answer 0: (score: 1)
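I'm not sure what you need... Going by your example, do you want a single string "**ABC****1****XYZ****2**"?
If so, this should work: htmlSnippet.Body.OuterText
Edit: OK, here is an example that tries to get the individual values...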
// Note: FindElement returns null if the document contains no table.
HtmlElement tableElement = FindElement(htmlSnippet.Body, "table");
foreach (HtmlElement row in tableElement.Children)
{
    if (row.Name.ToLower() == "tr")
    {
        // create whatever class you use for a row
        foreach (HtmlElement cell in row.Children)
        {
            if (cell.Name.ToLower() == "td")
            {
                // add a new cell to your row using cell.InnerText
            }
        }
    }
}
// *** snip ***
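// Depth-first search for the first element with the given (lower-case) name.
private HtmlElement FindElement(HtmlElement element, string name)
{
    if (element.Name.ToLower() == name)
    {
        return element;
    }
    foreach (HtmlElement child in element.Children)
    {
        // Recurse into the child, not into the not-yet-assigned result variable.
        HtmlElement test = FindElement(child, name);
        if (test != null)
        {
            return test;
        }
    }
    return null;
}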
Sorry, I don't have Visual Studio at hand right now to test the code... good luck ;-)
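For anyone who wants something they can run straight away, here is a minimal sketch of the same row-by-row, cell-by-cell walk. It deliberately swaps in System.Xml.Linq for the HtmlDocument/HtmlElement types used in this thread, so it only works under the assumption that the markup is well-formed XML, as the small table in the question is. A real page, such as the one at the URL in the question, is often not well-formed and would still need a proper HTML parser. The names TableText, ExtractRows and IsTag are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class TableText
{
    // Returns one list of trimmed cell texts per <tr>, in document order.
    // Assumption: the input is well-formed XML; XDocument.Parse throws otherwise.
    public static List<List<string>> ExtractRows(string xhtml)
    {
        XDocument doc = XDocument.Parse(xhtml);
        return doc.Descendants()
            .Where(e => IsTag(e, "tr"))
            .Select(row => row.Elements()
                .Where(e => IsTag(e, "td"))
                // Value concatenates all nested text, so <b>ABC</b> yields "ABC".
                .Select(cell => cell.Value.Trim())
                .ToList())
            .ToList();
    }

    // Case-insensitive tag comparison, since the sample uses <TR> and <TD>.
    private static bool IsTag(XElement e, string name)
    {
        return string.Equals(e.Name.LocalName, name, StringComparison.OrdinalIgnoreCase);
    }
}
```

Each inner list is one table row, so it can be mapped onto whatever row class or DataTable you use for the database and the GridView. Because the search uses Descendants(), rows wrapped in a tbody element are found as well.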