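How can I use LINQ to parse HTML on a web page and get the InnerHtml values from a table?
I'm using HtmlAgilityPack and would like to parse some values as cleanly as possible.
The numbers you see (00000, 00001, 00002, ...) are the agents' unique IDs.
So maybe there is a way to use LINQ to find these numbers and get the following values from the TDs (name, 123, state and info) => 00000, John, 123, IDLE, coffee, so that I can access them individually and work with them, perhaps in an array?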
</TH>
</TR>
<TR ALIGN=RIGHT>
<TD ALIGN=LEFT>00000</TD>
<TD ALIGN=LEFT>John</TD>
<TD ALIGN=CENTER>123</TD>
<TD ALIGN=LEFT>IDLE</TD>
<TD ALIGN=LEFT>coffee</TD>
</TR>
<TR ALIGN=RIGHT>
<TD ALIGN=LEFT>00001</TD>
<TD ALIGN=LEFT>Lisa</TD>
<TD ALIGN=CENTER>123</TD>
<TD ALIGN=LEFT>IDLE</TD>
<TD ALIGN=LEFT>coffee</TD>
</TR>
<TR ALIGN=RIGHT>
<TD ALIGN=LEFT>00002</TD>
<TD ALIGN=LEFT>Mary</TD>
<TD ALIGN=CENTER>123</TD>
<TD ALIGN=LEFT>IDLE</TD>
<TD ALIGN=LEFT>coffee</TD>
</TR>
<TR ALIGN=RIGHT>
<TD ALIGN=LEFT>00003</TD>
<TD ALIGN=LEFT>Tim</TD>
<TD ALIGN=CENTER>123</TD>
<TD ALIGN=LEFT>IDLE</TD>
<TD ALIGN=LEFT>coffee</TD>
</TR>
....
Thanks in advance!
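Answer 0 (score: 1)
This looks a lot like a "please give me the code I need" question, which I really dislike. Take a look at the following and make sure you understand it: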
var doc = ... // Load the document
var trs = doc.DocumentNode.Descendants("TR"); // Gives you all the TRs
foreach (var tr in trs)
{
    var tds = tr.Descendants("TD").ToArray(); // Get all the TDs in this row
    // The header row contains TH cells only, so skip rows without five TDs
    if (tds.Length < 5) continue;
    // Turn them into our data structure
    var data = new {
        Id = tds[0].InnerText, // The agent's unique number
        Name = tds[1].InnerText,
        Number = tds[2].InnerText,
        State = tds[3].InnerText,
        Info = tds[4].InnerText,
    };
    // Do something with data
}
Using LINQ only:
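var data = from tr in doc.DocumentNode.Descendants("TR")
           let tds = tr.Descendants("TD").ToArray()
           where tds.Length >= 5 // Skips the header row, which has no TDs
           select new {
               Id = tds[0].InnerText, // The agent's unique number
               Name = tds[1].InnerText,
               Number = tds[2].InnerText,
               State = tds[3].InnerText,
               Info = tds[4].InnerText,
           };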
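To tie this back to the question (accessing each agent's values individually), the rows can also be collected into a dictionary keyed by the agent number. The following is only a minimal sketch under some assumptions: the names `AgentTable` and `Parse` are made up for illustration, every data row is assumed to have at least five TD cells, and the agent numbers are assumed to be unique (otherwise `ToDictionary` throws).

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper, not part of HtmlAgilityPack.
static class AgentTable
{
    // Maps each agent number (first TD) to the trimmed text of all cells in its row.
    public static Dictionary<string, string[]> Parse(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // Parse the markup from a string

        return doc.DocumentNode.Descendants("tr")
            // Read the text of every TD in the row
            .Select(tr => tr.Descendants("td").Select(td => td.InnerText.Trim()).ToArray())
            // Header rows (TH cells only) and short rows are dropped here
            .Where(cells => cells.Length >= 5)
            // Assumes agent numbers are unique; duplicates would throw
            .ToDictionary(cells => cells[0]);
    }
}
```

With this, `AgentTable.Parse(html)["00000"]` would give you the cell texts of that agent's row as a string array, so index 1 is the name, 3 the state, and so on.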
Answer 1 (score: 0)
@flindeberg gave a very reasonable answer (+1 to him/her). You can avoid the ToArray like this:
private class Row
{
    public string Id { get; set; }
    public string Name { get; set; }
    public int Number { get; set; }
    public string State { get; set; }
    public string Info { get; set; }
}
...
// One mapping per column, in the same order as the TDs in a row
var mappings = new Action<string, Row>[]
{
    (value, row) => row.Id = value,
    (value, row) => row.Name = value,
    (value, row) => row.Number = int.Parse(value),
    (value, row) => row.State = value,
    (value, row) => row.Info = value
};
var doc = ... // Load the document
var trs = doc.DocumentNode.Descendants("TR"); // Gives you all the TRs
foreach (var tr in trs)
{
    var row = new Row();
    // Zip is deferred, so Count() is needed to actually run the mappings
    var mapped = tr.Descendants("TD").Zip(mappings, (td, map) =>
    {
        map(td.InnerText, row);
        return true;
    }).Count();
    if (mapped < mappings.Length) continue; // Header row or incomplete row
    // You now have a populated row.
}
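A caveat worth knowing when using Zip this way: `Enumerable.Zip` uses deferred execution, so side effects inside the selector only happen once the resulting sequence is enumerated. The sketch below illustrates this with plain strings instead of HtmlNodes; `ZipDemo` and `Run` are hypothetical names used only for this example.

```csharp
using System.Collections.Generic;
using System.Linq;

static class ZipDemo
{
    // Returns what the selector recorded; 'enumerate' controls
    // whether the zipped sequence is ever consumed.
    public static List<string> Run(bool enumerate)
    {
        var cells = new[] { "John", "123", "IDLE" };
        var labels = new[] { "Name", "Number", "State" };
        var seen = new List<string>();

        // Nothing runs here yet: Zip only builds a lazy sequence
        var zipped = cells.Zip(labels, (cell, label) =>
        {
            seen.Add(label + "=" + cell); // Side effect inside the selector
            return true;
        });

        if (enumerate)
        {
            zipped.Count(); // Consuming the sequence is what runs the selector
        }
        return seen;
    }
}
```

`ZipDemo.Run(false)` returns an empty list, while `ZipDemo.Run(true)` returns three entries. If you rely on the selector to populate an object, make sure something enumerates the result (a foreach, `Count()`, `ToList()` and so on).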