I have the following HTML. It is an HTML file saved as plain text, which I read from my local hard disk:
"<span style=""font-size:14px;""><span style=""""><strong>Description:</strong><br />
Material:Cotton+Polyester<br />
Color:White-Black<br />
Occasion: Casual<br /><br />
<strong>Details in size:</strong></span></span><br />
<div border=""1"" class=""tab02"" style=""border: 1px dashed rgb(204- 204- 204); border-collapse: collapse; border-spacing: 0px; text-align: center; font-size: 14px; font-family: Arial- Helvetica- sans-serif;"" width=""100%"">
<div>
<div style=""border: 1px dashed rgb(204- 204- 204); border-collapse: collapse; border-spacing: 0px;"">
<span style=""border: 1px dashed rgb(204- 204- 204); border-collapse: collapse; border-spacing: 0px; padding: 5px 10px;"">
US Size</span>
<span style=""border: 1px dashed rgb(204- 204- 204); border-collapse: collapse; border-spacing: 0px; padding: 5px 10px;"">
M</span>
<span style=""border: 1px dashed rgb(204- 204- 204); border-collapse: collapse; border-spacing: 0px; padding: 5px 10px;"">
L</span>
<span style=""border: 1px dashed rgb(204- 204- 204); border-collapse: collapse; border-spacing: 0px; padding: 5px 10px;"">
XL</span>
</div>
<div style=""border: 1px dashed rgb(204- 204- 204); border-collapse: collapse; border-spacing: 0px;"">
<span style=""border: 1px dashed rgb(204- 204- 204); border-collapse: collapse; border-spacing: 0px; padding: 5px 10px;"">
Asian Size</span>
<span style=""border: 1px dashed rgb(204- 204- 204); border-collapse: collapse; border-spacing: 0px; padding: 5px 10px;"">
L</span>
<span style=""border: 1px dashed rgb(204- 204- 204); border-collapse: collapse; border-spacing: 0px; padding: 5px 10px;"">
XL</span>
<span style=""border: 1px dashed rgb(204- 204- 204); border-collapse: collapse; border-spacing: 0px; padding: 5px 10px;"">
2XL</span>
</div>
I need to get the inner divs using C# and XPath. This is what I have done so far, using XPath:
    string SizeDescriptions = File.ReadAllText(@"E:\Elance\Product Description HTML\HTML_Product_Description.txt");
    HtmlDocument document = new HtmlDocument();
    string htmlString = SizeDescriptions;// "<html>blabla</html>";
    document.LoadHtml(htmlString);
    HtmlNodeCollection collection = document.DocumentNode.SelectNodes("//div").FindFirst("div").ChildNodes;
    foreach (HtmlNode link in collection)
    {
        HtmlNodeCollection Sizes = link.SelectNodes("/div/span");
        foreach(HtmlNode SizeDiv in Sizes)
        {
            TableRow tr1 = new TableRow();
            TableCell cell1 = new TableCell();
            tr1.
        }
        string target = link.Attributes["href"].Value;
    }
Answer 0 (score: 0)
Use
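    HtmlNodeCollection innerDivs = document.DocumentNode.SelectNodes("//div/div");
    foreach (HtmlNode div in innerDivs)
    {
        // Relative XPath: selects only the span children of the current div
        HtmlNodeCollection spans = div.SelectNodes("span");
        // SelectNodes returns null when nothing matches (e.g. a wrapper div without spans)
        if (spans == null) continue;
        foreach (HtmlNode span in spans)
        {
            string text = span.InnerText;
        }
    }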
Of course, if it does not matter which div a span belongs to, you can use a single XPath expression and a single foreach, e.g.
    HtmlNodeCollection spans = document.DocumentNode.SelectNodes("//div/div/span");
    foreach (HtmlNode span in spans)
    {
        string text = span.InnerText;
    }
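If you do need to know which row each span belongs to, the two loops can be wrapped into a small helper. The following is only a sketch, assuming the library in use is HtmlAgilityPack (installed from NuGet); the class name `SizeTableReader`, the method `ReadRows`, and the shortened sample HTML are made up for illustration and are not part of the question.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: the HtmlDocument/HtmlNode types above come from this package

class SizeTableReader
{
    // Returns one list of cell texts per inner div (one row of the size chart).
    static List<List<string>> ReadRows(string html)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        var rows = new List<List<string>>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection innerDivs = document.DocumentNode.SelectNodes("//div/div");
        if (innerDivs == null)
            return rows;

        foreach (HtmlNode div in innerDivs)
        {
            // "span" is relative to the current div; a path starting with "/"
            // would be evaluated from the document root instead.
            HtmlNodeCollection spans = div.SelectNodes("span");
            if (spans == null)
                continue; // a wrapper div that has no span children

            var cells = new List<string>();
            foreach (HtmlNode span in spans)
                cells.Add(span.InnerText.Trim()); // trim the line break after each opening tag
            rows.Add(cells);
        }
        return rows;
    }

    static void Main()
    {
        // Hypothetical, shortened input with the same nesting as the question's HTML.
        string html = "<div><div><div><span>\nUS Size</span><span>\nM</span></div>"
                    + "<div><span>\nAsian Size</span><span>\nL</span></div></div></div>";

        foreach (List<string> row in ReadRows(html))
            Console.WriteLine(string.Join(" | ", row));
    }
}
```

The null checks matter here because `//div/div` also matches the wrapper div that sits between the outer div and the rows, and that wrapper has no span children of its own.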