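I'm having trouble reading the values of a table in an HTML page because the table has no ids. I need to get all the values in the second column and save them to an array. I'm using HtmlAgilityPack, and the problem occurs when selecting the nodes: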
Dim doc As HtmlDocument
Dim web As New HtmlWeb()
Dim str As String
doc = Web.Load("http://www.dietas.net/tablas-y-calculadoras/tabla-de-composicion-nutricional-de-los-alimentos/carnes-y-derivados/aves/pechuga-de-pollo.html#")
Dim nodes_filas As HtmlNode() = doc.DocumentNode.SelectNodes("//table[@id='']//tr").ToArray
Dim nodes_columnas As HtmlNode() = doc.DocumentNode.SelectNodes("//td").ToArray
For Each row As HtmlNode In nodes_filas
For Each column As HtmlNode In nodes_columnas
str = column.InnerHtml & vbCrLf
Next
Next
This is the table:
<table cellspacing="1" cellpadding="3" width="100%" border="0">
<tr>
<td colspan="2" style="font-size:13px;color:#55711C;padding-bottom:5px;">Aporte por ración</td>
</tr>
<tr style="background-color:#EBEBEB">
<td width="125">Energía [Kcal]</td>
<td class="td_right">145,00</td>
</tr>
<tr>
<td>Proteína [g]</td>
<td class="td_right">22,20</td>
</tr>
<tr style="background-color:#EBEBEB">
<td>Hidratos carbono [g]</td>
<td class="td_right">0,00</td>
</tr>
<tr>
<td>Fibra [g]</td>
<td class="td_right">0,00</td>
</tr>
<tr style="background-color:#EBEBEB">
<td>Grasa total [g]</td>
<td class="td_right">6,20</td>
</tr>
<tr>
<td>AGS [g]</td>
<td class="td_right">1,91</td>
</tr>
<tr style="background-color:#EBEBEB">
<td>AGM [g]</td>
<td class="td_right">1,92</td>
</tr>
<tr>
<td>AGP [g]</td>
<td class="td_right">1,52</td>
</tr>
<tr style="background-color:#EBEBEB">
<td>AGP /AGS</td>
<td class="td_right">0,79</td>
</tr>
<tr>
<td>(AGP + AGM) / AGS</td>
<td class="td_right"> 1,80</td>
</tr>
<tr style="background-color:#EBEBEB">
<td>Colesterol [mg]</td>
<td class="td_right">62,00</td>
</tr>
<tr>
<td>Alcohol [g]</td>
<td class="td_right">0,00</td>
</tr>
<tr style="background-color:#EBEBEB">
<td>Agua [g]</td>
<td class="td_right">71,60</td>
</tr>
</table>
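Answer 0 (score: 0)
Sorry, I don't have VB installed, but the C# version should be enough to give you the idea. The cells you want all carry the td_right class, so you can query for it with either a lambda or XPath.
I prefer the lambda/LINQ version because I'm familiar with LINQ and don't need to remember the XPath syntax.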
LAMBDA:
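using System.Linq;
using HtmlAgilityPack;
// Extension methods must be declared in a static, non-nested class.
public static class HtmlNodeExtensions
{
    // True when the node's class attribute contains every name in classValueArray.
    public static bool HasClass(this HtmlNode node, params string[] classValueArray)
    {
        var classValues = node.GetAttributeValue("class", "").Split(' ');
        return classValueArray.All(c => classValues.Contains(c));
    }
}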
var url = "http://www.dietas.net/tablas-y-calculadoras/tabla-de-composicion-nutricional-de-los-alimentos/carnes-y-derivados/aves/pechuga-de-pollo.html#";
var htmlWeb = new HtmlWeb();
var htmlDoc = htmlWeb.Load(url);
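// Collect the trimmed text of every <td> whose class list contains "td_right".
var nodes = htmlDoc.DocumentNode.Descendants("td").Where(td => td.HasClass("td_right")).Select(td => td.InnerText.Trim()).ToArray();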
XPATH:
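// SelectNodes returns null (not an empty collection) when nothing matches, so check before enumerating.
var nodes2 = htmlDoc.DocumentNode.SelectNodes("//td[@class='td_right']");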
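Since the question is in VB, here is a rough VB.NET rendering of the XPath approach that ends with the second-column values in a string array. Treat it as a sketch rather than a tested solution: the inline HTML is a trimmed sample standing in for the real page, and the names SecondColumnSketch, cells and values are only illustrative. To run it against the site, replace the LoadHtml call with the HtmlWeb.Load call from the question.

```vbnet
Imports System
Imports System.Linq
Imports HtmlAgilityPack

Module SecondColumnSketch
    Sub Main()
        ' Assumption: a trimmed inline copy of the table, so the sketch
        ' does not depend on the live page being reachable.
        Dim html As String =
            "<table><tr><td colspan=""2"">Aporte por ración</td></tr>" &
            "<tr><td>Energía [Kcal]</td><td class=""td_right"">145,00</td></tr>" &
            "<tr><td>Proteína [g]</td><td class=""td_right""> 22,20</td></tr></table>"

        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        ' The table has no id, so select the cells by their class instead.
        ' SelectNodes returns Nothing when there is no match.
        Dim cells As HtmlNodeCollection =
            doc.DocumentNode.SelectNodes("//td[@class='td_right']")

        ' Trim each cell because some values carry leading whitespace.
        Dim values As String() =
            If(cells Is Nothing,
               New String() {},
               cells.Select(Function(td) td.InnerText.Trim()).ToArray())

        For Each value As String In values
            Console.WriteLine(value)
        Next
    End Sub
End Module
```

The Nothing check matters here because the original `//table[@id='']//tr` expression only matches tables that have an id attribute set to an empty string. This table has no id attribute at all, so SelectNodes returns Nothing and the chained ToArray call fails.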