I need some values from a web page, so I am building a scraper with Html Agility Pack.
Below are the HTML of the page and my C# code.
HTML of the page:
<div class="box-overflow">
<div class="box-overflow__in">
<table class="table-main js-tablebanner-t js-tablebanner-ntb">
<tr>
<th class="h-text-left" colspan="2">17. Round</th>
<th class="h-text-center">1</th>
<th class="h-text-center">X</th>
<th class="h-text-center">2</th>
<th> </th>
</tr>
<tr>
<td class="h-text-left"><a href=
"/soccer/poland/ekstraklasa/lechia-gdansk-leczna/Kjnscb6D/" class=
"in-match"><span>Lechia Gdansk</span> - <span>Leczna</span></a></td>
<td class="h-text-center"><a href=
"/soccer/poland/ekstraklasa/lechia-gdansk-leczna/Kjnscb6D/">3:0</a></td>
<td class="table-matches__odds colored"></td>
<td class="table-matches__odds" data-odd="4.04"></td>
<td class="table-matches__odds" data-odd="6.29"></td>
<td class="h-text-right h-text-no-wrap">28.11.2016</td>
</tr>
<tr>
<td class="h-text-left"><a href=
"/soccer/poland/ekstraklasa/plock-piast-gliwice/KrhILsqE/" class=
"in-match"><span>Plock</span> - <span>Piast Gliwice</span></a></td>
<td class="h-text-center"><a href=
"/soccer/poland/ekstraklasa/plock-piast-gliwice/KrhILsqE/">0:0</a></td>
<td class="table-matches__odds" data-odd="2.05"></td>
<td class="table-matches__odds colored"></td>
<td class="table-matches__odds" data-odd="3.50"></td>
<td class="h-text-right h-text-no-wrap">27.11.2016</td>
</tr>
<tr>
<td class="h-text-left"><a href=
"/soccer/poland/ekstraklasa/slask-wroclaw-legia/bZjMK1bK/" class=
"in-match"><span>Slask Wroclaw</span> - <span>Legia</span></a></td>
<td class="h-text-center"><a href=
"/soccer/poland/ekstraklasa/slask-wroclaw-legia/bZjMK1bK/">0:4</a></td>
<td class="table-matches__odds" data-odd="4.53"></td>
<td class="table-matches__odds" data-odd="3.64"></td>
<td class="table-matches__odds colored"></td>
<td class="h-text-right h-text-no-wrap">27.11.2016</td>
</tr>
</table>
</div>
</div>
My C#:
var url = "http://www.betexplorer.com/soccer/poland/ekstraklasa/results/";
var web = new HtmlWeb();
var doc = web.Load(url);
Bets = new List<Bet>();
// Read the rows
var Rows = doc.DocumentNode.SelectNodes("//table");
foreach (var row in Rows)
{
    if (!row.GetAttributeValue("class", "").Contains("table-main js-tablebanner-t js-tablebanner-ntb"))
    {
        if (string.IsNullOrEmpty(row.InnerText))
            continue;
        var rowBet = new Bet();
        foreach (var node in row.ChildNodes)
        {
            var data_odd = node.GetAttributeValue("data-odd", "");
            if (string.IsNullOrEmpty(data_odd))
            {
                if (node.GetAttributeValue("class", "").Contains("in-match"))
                {
                    rowBet.Match = node.InnerText.Trim();
                    var matchTeam = rowBet.Match.Split(new[] { " - " }, StringSplitOptions.RemoveEmptyEntries);
                    rowBet.Home = matchTeam[0];
                    rowBet.Host = matchTeam[1];
                }
                if (node.GetAttributeValue("class", "").Contains("h-text-center"))
                {
                    rowBet.Result = node.InnerText.Trim();
                    var matchPoints = rowBet.Result.Split(new[] { ':' }, StringSplitOptions.RemoveEmptyEntries);
                    int help;
                    if (int.TryParse(matchPoints[0], out help))
                    {
                        rowBet.HomePoints = help;
                    }
                    if (matchPoints.Length == 2 && int.TryParse(matchPoints[1], out help))
                    {
                        rowBet.HostPoints = help;
                    }
                }
                if (node.GetAttributeValue("class", "").Contains("h-text-right h-text-no-wrap"))
                    rowBet.Date = node.InnerText.Trim();
            }
            else
            {
                rowBet.Odds.Add(data_odd);
            }
        }
        if (!string.IsNullOrEmpty(rowBet.Match))
            Bets.Add(rowBet);
    }
}
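Some more information:
I need to extract the team names (e.g. Lechia Gdansk - Leczna),
the result (e.g. 3:0),
the data-odd values (e.g. 1.49, 4.04, 6.29),
and the match date (e.g. 28.11.2016).
If anyone needs more information, just ask and tell me what you want to know. Thanks.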
Answer 0 (score: 1)
I would do it like this:
var list = doc.DocumentNode.SelectSingleNode("//table[@class='table-main js-tablebanner-t js-tablebanner-ntb']")
    .Descendants("tr")
    .Select(x => new
    {
        Val1 = x.SelectSingleNode("td[@class='h-text-left']")?.InnerText,
        Val2 = x.SelectSingleNode("td[@class='h-text-center']")?.InnerText
    })
    .Where(x => x.Val1 != null)
    .ToList();
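The snippet above only reads the match name and the result. A rough sketch of how the same row-by-row approach might be extended to the data-odd values and the date is shown here. It assumes the markup looks like the sample in the question; the names MatchRow and MatchParser are made up for illustration, and contains(@class, ...) is used instead of an exact class match so that extra classes on the table do not break the query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

// Hypothetical container for one table row; not part of the original question.
public class MatchRow
{
    public string Match { get; set; }
    public string Result { get; set; }
    public List<string> Odds { get; set; }
    public string Date { get; set; }
}

public static class MatchParser
{
    // Parses every <tr> that has <td> cells inside the results table.
    // Header rows contain only <th>, so the tr[td] predicate skips them.
    public static List<MatchRow> Parse(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var rows = doc.DocumentNode.SelectNodes(
            "//table[contains(@class,'table-main')]//tr[td]");
        if (rows == null)
            return new List<MatchRow>();

        return rows.Select(tr => new MatchRow
        {
            Match = Text(tr.SelectSingleNode(".//a[contains(@class,'in-match')]")),
            Result = Text(tr.SelectSingleNode("td[contains(@class,'h-text-center')]")),
            Odds = ReadOdds(tr),
            Date = Text(tr.SelectSingleNode("td[contains(@class,'h-text-right')]"))
        }).ToList();
    }

    // Collects the data-odd attribute of each odds cell, in column order.
    // A cell without the attribute contributes an empty string.
    private static List<string> ReadOdds(HtmlNode tr)
    {
        var cells = tr.SelectNodes("td[contains(@class,'table-matches__odds')]");
        if (cells == null)
            return new List<string>();
        return cells.Select(td => td.GetAttributeValue("data-odd", "")).ToList();
    }

    // Returns the trimmed, entity-decoded text of a node, or null if the node is missing.
    private static string Text(HtmlNode node)
    {
        return node == null ? null : HtmlEntity.DeEntitize(node.InnerText).Trim();
    }
}
```

With the document loaded in the question, this could be called as MatchParser.Parse(doc.DocumentNode.OuterHtml). In the sample HTML the "colored" odds cell carries no data-odd attribute, so that column comes back as an empty string; if the live page stores that value somewhere else, ReadOdds would need to be adjusted.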