I am using C# and HtmlAgilityPack to scrape data from a website, and I have almost got what I want.
This is the HTML I am working with:
<tr class="gtitle"><th colspan="11" class="nobr">Next matches</th></tr>
<tr class="rtitle first-row"><th class="left first-cell nobr"> </th><th class="nobr"> </th><th class="nobr"> </th><th> </th><th class="bs" title="Number of Bookies">B's</th><th>1</th><th>X</th><th>2</th><th class="col-time nobr"> </th><th class="space nobr"> </th><th class="col-score last-cell nobr"> </th></tr>
<tr class="match-line first-row"><td class="tl first-cell nobr match-day-after" title="Day after tomorrow match"></td><td class="tl nobr"><a href="/soccer/norway/tippeligaen/valerenga-bodo-glimt/KdDafkm4/">Valerenga - Bodo/Glimt</a></td><td class="livebet nobr"> </td><td class="tv"> </td><td class="bs">25</td><td class="odds"><span><a href="/my_selections.php?action=3&matchid=KdDafkm4&outcomeid=2aeqnxv464x0x4s2rj&otheroutcomes=2aeqnxv498x0x0,2aeqnxv464x0x4s2rk" onclick="return my_selections_click(this);" title="Add to My Selections" target="mySelections" class="mySelectionsTip" data-odd="1.52"></a></span></td><td class="odds"><span><a href="/my_selections.php?action=3&matchid=KdDafkm4&outcomeid=2aeqnxv498x0x0&otheroutcomes=2aeqnxv464x0x4s2rj,2aeqnxv464x0x4s2rk" onclick="return my_selections_click(this);" title="Add to My Selections" target="mySelections" class="mySelectionsTip" data-odd="4.09"></a></span></td><td class="odds"><span><a href="/my_selections.php?action=3&matchid=KdDafkm4&outcomeid=2aeqnxv464x0x4s2rk&otheroutcomes=2aeqnxv464x0x4s2rj,2aeqnxv498x0x0" onclick="return my_selections_click(this);" title="Add to My Selections" target="mySelections" class="mySelectionsTip" data-odd="5.80"></a></span></td><td class="last-cell nobr right" colspan="3">19.08.2016 19:00</td></tr>
<tr class="match-line strong"><td class="tl first-cell nobr"></td><td class="tl nobr"><a href="/soccer/norway/tippeligaen/lillestrom-haugesund/0htZwU2N/">Lillestrom - Haugesund</a></td><td class="livebet nobr"> </td><td class="tv"> </td><td class="bs">24</td><td class="odds"><span><a href="/my_selections.php?action=3&matchid=0htZwU2N&outcomeid=2aeqhxv464x0x4s2r7&otheroutcomes=2aeqhxv498x0x0,2aeqhxv464x0x4s2r8" onclick="return my_selections_click(this);" title="Add to My Selections" target="mySelections" class="mySelectionsTip" data-odd="2.34"></a></span></td><td class="odds"><span><a href="/my_selections.php?action=3&matchid=0htZwU2N&outcomeid=2aeqhxv498x0x0&otheroutcomes=2aeqhxv464x0x4s2r7,2aeqhxv464x0x4s2r8" onclick="return my_selections_click(this);" title="Add to My Selections" target="mySelections" class="mySelectionsTip" data-odd="3.40"></a></span></td><td class="odds"><span><a href="/my_selections.php?action=3&matchid=0htZwU2N&outcomeid=2aeqhxv464x0x4s2r8&otheroutcomes=2aeqhxv464x0x4s2r7,2aeqhxv498x0x0" onclick="return my_selections_click(this);" title="Add to My Selections" target="mySelections" class="mySelectionsTip" data-odd="2.83"></a></span></td><td class="last-cell nobr right" colspan="3">20.08.2016 15:30</td></tr>
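I have two problems: 1) I need to split the data in the MatchNM column into HomeNM and HostNM; 2) I need to read the value of the node attribute "data-odd" and put the values into odd1NM, oddXNM and odd2NM.
This is the code I wrote:
In Form1: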
var url1 = "http://www.betexplorer.com/soccer/norway/tippeligaen/";
var web1 = new HtmlWeb();
var doc1 = web1.Load(url1);
BetsNM = new List<NextMatch>();
// Read the rows
var Rows = doc1.DocumentNode.SelectNodes("//tr");
foreach (var row in Rows)
{
    if (!row.GetAttributeValue("class", "").Contains("rtitle"))
    {
        if (string.IsNullOrEmpty(row.InnerText))
            continue;
        var rowBetNM = new NextMatch();
        foreach (var node in row.ChildNodes)
        {
            var data_odd1 = node.GetAttributeValue("data-odd", "");
            if (string.IsNullOrEmpty(data_odd1))
            {
                if (node.GetAttributeValue("class", "").Contains("tl"))
                {
                    rowBetNM.MatchNM = node.InnerText.Trim();
                    var matchTeamNM = rowBetNM.MatchNM.Split(new[] { " - " }, StringSplitOptions.RemoveEmptyEntries);
                    //rowBetNM.HomeNM = matchTeamNM[0];
                    //rowBetNM.HostNM = matchTeamNM[1];
                }
                if (node.GetAttributeValue("class", "").Contains("last-cell"))
                    rowBetNM.DateNM = node.InnerText.Trim();
            }
            else
            {
                rowBetNM.OddsNM.Add(data_odd1);
            }
        }
        if (!string.IsNullOrEmpty(rowBetNM.MatchNM))
            BetsNM.Add(rowBetNM);
    }
}
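A note on the second problem: in the HTML above, `data-odd` sits on the `<a>` element nested inside `<td class="odds"><span>`, while `row.ChildNodes` only yields the `<td>` cells themselves, so `node.GetAttributeValue("data-odd", "")` on a cell is likely to always come back empty. Below is a minimal sketch of one way around that, assuming it is acceptable to search all descendants of the row. `OddsSketch`, `ReadOdds` and the shortened HTML string are illustrative names and data, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class OddsSketch
{
    // Collects the non-empty data-odd values from every <a> below the row,
    // in document order (the header row lists the columns as 1, X, 2).
    public static List<string> ReadOdds(HtmlNode row)
    {
        return row.Descendants("a")
                  .Select(a => a.GetAttributeValue("data-odd", ""))
                  .Where(v => !string.IsNullOrEmpty(v))
                  .ToList();
    }

    public static void Main()
    {
        // Shortened copy of one match row from the question (assumption:
        // the live page keeps the same nesting td > span > a).
        var html =
            "<table><tr class=\"match-line\">" +
            "<td class=\"tl nobr\"><a href=\"#\">Valerenga - Bodo/Glimt</a></td>" +
            "<td class=\"odds\"><span><a data-odd=\"1.52\"></a></span></td>" +
            "<td class=\"odds\"><span><a data-odd=\"4.09\"></a></span></td>" +
            "<td class=\"odds\"><span><a data-odd=\"5.80\"></a></span></td>" +
            "</tr></table>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var row = doc.DocumentNode.SelectSingleNode("//tr[contains(@class,'match-line')]");

        var odds = ReadOdds(row);
        if (odds.Count == 3)
        {
            // Only map to 1 / X / 2 when exactly three values were found.
            Console.WriteLine("1=" + odds[0] + " X=" + odds[1] + " 2=" + odds[2]);
        }
    }
}
```

With something like this, the three values could be copied into odd1NM, oddXNM and odd2NM by position, guarded by the count check so rows without odds are skipped.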
NextMatch.cs
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace bexscraping
{
    class NextMatch
    {
        public string MatchNM { get; set; }
        public List<string> OddsNM { get; set; }
        public string DateNM { get; set; }
        public string HomeNM { get; set; }
        public string HostNM { get; set; }
        public string odd1NM { get; set; }
        public string oddXNM { get; set; }
        public string odd2NM { get; set; }

        public NextMatch()
        {
            OddsNM = new List<string>();
        }

        public override string ToString()
        {
            // Two arguments, so only {0} and {1}; a {2} placeholder here throws FormatException.
            String MatchInfo = String.Format("{0}: {1}", DateNM, MatchNM);
            String OddsInfo = String.Empty;
            foreach (string d in OddsNM)
                OddsInfo += " | " + d;
            return MatchInfo + "\n" + OddsInfo;
        }
    }
}
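For the first problem, one thing to watch is that every row also has an empty first cell whose class contains "tl". Splitting its empty text gives an empty array, so indexing `[0]` and `[1]` would throw. Below is a minimal sketch of a guarded split, assuming the separator is always " - " and only its first occurrence matters. `MatchSketch` and `TrySplitTeams` are hypothetical names, not part of the original code.

```csharp
using System;

public static class MatchSketch
{
    // Splits "Home - Away" on the first " - " only.
    // Returns false (and empty strings) when the text is blank or has no separator.
    public static bool TrySplitTeams(string match, out string home, out string away)
    {
        home = string.Empty;
        away = string.Empty;
        if (string.IsNullOrWhiteSpace(match))
            return false;

        // count = 2 keeps any later " - " inside the second part.
        var parts = match.Split(new[] { " - " }, 2, StringSplitOptions.None);
        if (parts.Length != 2)
            return false;

        home = parts[0].Trim();
        away = parts[1].Trim();
        return home.Length > 0 && away.Length > 0;
    }

    public static void Main()
    {
        string home, away;
        if (TrySplitTeams("Valerenga - Bodo/Glimt", out home, out away))
            Console.WriteLine(home + " | " + away);

        // The empty first cell of a row is rejected instead of throwing.
        Console.WriteLine(TrySplitTeams("", out home, out away));
    }
}
```

In the row loop, HomeNM and HostNM would then be assigned only when the call returns true, which leaves them untouched for the empty cell.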
I really don't understand where the problem is. Can someone help me? Thanks!
EDIT: I re-checked my post and made some corrections.