I am using HtmlAgilityPack to extract data from HTML source. Here is a sample of the HTML:
<div class="enum-container">
<div class="enum">
<span class="field-key">MD5</span> a4188cf2b9189f82b855350233a307eb
</div>
<div class="enum">
<span class="field-key">SHA1</span> c3eedd67a14810b8c639eb77ed2731e574245b2a
</div>
<div class="enum">
<span class="field-key">File size</span>
3.8 KB ( 3854 bytes )
</div>
</div>
I am using this code:
Dim Table2 As New DataTable()
Table2.Columns.Add("Value1", GetType(String))
Table2.Columns.Add("Value2", GetType(String))
For Each row1 As HtmlNode In doc.DocumentNode.SelectNodes("//div[@id='file-details']//div[@class='enum-container']//div[@class='enum']")
    Dim MyValue1 As HtmlNode = row1.SelectSingleNode("//span[@class='field-key']")
    Dim MyValue2 As String = row1.InnerText
    Table2.Rows.Add(MyValue1.InnerText, MyValue2)
Next
DataGridView3.DataSource = Table2
The result looks like this:
http://i.stack.imgur.com/vPriY.png
As you can see, the first column gets the same value (MD5) in every row.
What I want is this:
http://i.stack.imgur.com/jlsk5.png
Thanks.
Answer 0 (score: 0)
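An XPath that starts with "//" is evaluated from the document root, not from the current node, so on every iteration you select the first matching span in the whole document. Remove the leading "//" so that the expression is relative to the enum node and selects its direct child span. The value is the text node that follows the span, so read it from the span's NextSibling rather than from the InnerText of the whole div.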
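To make the difference concrete, here is a minimal sketch that runs the same per-node lookup with three XPath forms. The class name XPathScopeDemo, the helper KeysFor, and the shortened two-entry markup are made up for illustration; only the HtmlAgilityPack calls (LoadHtml, SelectNodes, SelectSingleNode, InnerText) come from the library.

```csharp
using System;
using HtmlAgilityPack;

public static class XPathScopeDemo
{
    // Sample markup from the question, shortened to two entries (illustrative).
    private const string Html = @"
<div class=""enum-container"">
  <div class=""enum""><span class=""field-key"">MD5</span> a4188cf2b9189f82b855350233a307eb</div>
  <div class=""enum""><span class=""field-key"">SHA1</span> c3eedd67a14810b8c639eb77ed2731e574245b2a</div>
</div>";

    // Hypothetical helper: evaluates spanXPath against each enum div
    // and returns the text of the span it finds.
    public static string[] KeysFor(string spanXPath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(Html);

        HtmlNodeCollection enums = doc.DocumentNode.SelectNodes("//div[@class='enum']");
        var keys = new string[enums.Count];
        for (int i = 0; i < enums.Count; i++)
        {
            keys[i] = enums[i].SelectSingleNode(spanXPath).InnerText;
        }
        return keys;
    }

    public static void Main()
    {
        // Absolute path: searched from the document root for every div.
        Console.WriteLine(string.Join(", ", KeysFor("//span[@class='field-key']")));
        // Relative path: direct child of the current div.
        Console.WriteLine(string.Join(", ", KeysFor("span[@class='field-key']")));
        // Relative path: any descendant of the current div.
        Console.WriteLine(string.Join(", ", KeysFor(".//span[@class='field-key']")));
    }
}
```

With this markup the first line should print MD5, MD5 (the duplicate seen in the question), while the two relative forms should print MD5, SHA1. Use ".//" instead of the bare child path if the span may be nested deeper inside the div.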
C#
DataTable fileDetailsTable = new DataTable();
fileDetailsTable.Columns.Add("Key", typeof(string));
fileDetailsTable.Columns.Add("Value", typeof(string));
// SelectNodes returns null when nothing matches, so guard before iterating.
HtmlNodeCollection enumNodes = document.DocumentNode.SelectNodes("//div[@id='file-details']//div[@class='enum-container']//div[@class='enum']");
if (enumNodes != null)
{
    foreach (HtmlNode enumNode in enumNodes)
    {
        // Select the child span from the enum node (relative path, no leading "//").
        HtmlNode fieldKeyNode = enumNode.SelectSingleNode("span[@class='field-key']");
        if (fieldKeyNode != null)
        {
            // Grab the key.
            string fieldKey = fieldKeyNode.InnerText;
            // Grab the value: the text node that follows the field key, with surrounding whitespace trimmed.
            string fieldValue = fieldKeyNode.NextSibling?.InnerText.Trim() ?? string.Empty;
            fileDetailsTable.Rows.Add(fieldKey, fieldValue);
        }
    }
}
VB.NET
Dim fileDetailsTable As New DataTable()
fileDetailsTable.Columns.Add("Key", GetType(String))
fileDetailsTable.Columns.Add("Value", GetType(String))
'SelectNodes returns Nothing when nothing matches, so guard before iterating.
Dim enumNodes As HtmlNodeCollection = document.DocumentNode.SelectNodes("//div[@id='file-details']//div[@class='enum-container']//div[@class='enum']")
If enumNodes IsNot Nothing Then
    For Each enumNode As HtmlNode In enumNodes
        'Select the child span from the enum node (relative path, no leading "//").
        Dim fieldKeyNode As HtmlNode = enumNode.SelectSingleNode("span[@class='field-key']")
        If fieldKeyNode IsNot Nothing Then
            'Grab the key.
            Dim fieldKey As String = fieldKeyNode.InnerText
            'Grab the value: the text node that follows the field key, with surrounding whitespace trimmed.
            Dim fieldValue As String = If(fieldKeyNode.NextSibling?.InnerText.Trim(), String.Empty)
            fileDetailsTable.Rows.Add(fieldKey, fieldValue)
        End If
    Next
End If