Retrieving data from crawled HTML in ASP.NET MVC 4

Date: 2015-12-24 09:06:00

Tags: asp.net-mvc-4 web-crawler

I have data crawled from a web page, and I want to extract a few values from the HTML. How can I do that?

A sample of the crawled data:

<table width="272" cellpadding="5" cellspacing="0" class="floatleft" style="border-right:1px dotted #BFBFBF"> 
<tr> 
    <td width="130" class="paymentlabel"><strong>Transfer Fee</strong></td> 
    <td width="120"> $11 </td> 
    <td><a href="popups/whatsTDIT.php?type=1" class="whatsTDIT"><img src="http://i.i-sgcm.com/used_cars/qmark_red_18x18.png" width="18" height="18" /></a></td> 
</tr> 
<tr> 
    <td class="paymentlabel" valign="top"><strong>Down Payment</strong></td> 
    <td> $47,444 (<a href="info_financial.php?ID=526494">change</a>) <p style="line-height:14px;"><span class="font_gray">Maximum 50% Loan</span></p> </td> 
    <td><a href="popups/whatsTDIT.php?type=2" class="whatsTDIT"><img src="http://i.i-sgcm.com/used_cars/qmark_red_18x18.png" width="18" height="18" /></a></td> 
</tr> 
<tr> 
    <td class="paymentlabel"><strong>1st Instalment</strong></td> 
    <td> $901 </td> 
    <td><a href="popups/whatsTDIT.php?type=3" class="whatsTDIT"><img src="http://i.i-sgcm.com/used_cars/qmark_red_18x18.png" width="18" height="18" /></a></td> 
</tr> 
<tr bgcolor="#FFF4F4"> 
    <td class="paymentlabel" valign="top"><strong>Total</strong></td> 
    <td valign="top"> <strong class="font_red"> $48,356<br /><span class="font_gray" style="font-weight:normal;">(excluding insurance)</span> </strong> </td> 
    <td><a href="popups/whatsTDIT.php?type=4" class="whatsTDIT"><img src="http://i.i-sgcm.com/used_cars/qmark_red_18x18.png" width="18" height="18" /></a></td> 
</tr> 
<tr> 
    <td class="font_gray font_10" colspan="3" style="padding:7px 0 7px 5px; line-height:12px;">Estimates based on 50% loan at 2.80% interest rate. <br />Check with seller for exact figure.</td> 
</tr> 
</table>

Crawler controller:

// Requires: using System.IO; using System.Net;
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
request.UserAgent = "A .NET Web Crawler";
// Dispose the response, stream and reader so the connection is released.
using (WebResponse response = request.GetResponse())
using (Stream stream = response.GetResponseStream())
using (StreamReader reader = new StreamReader(stream))
{
    string htmlText = reader.ReadToEnd();
    return htmlText;
}

I want to get the Transfer Fee ($11), the Down Payment ($47,444), the 1st Instalment ($901) and the Total ($48,356). Is that possible? Thanks for your help!

1 Answer:

Answer 0: (score: 0)

Yes, you can. You need a library that loads the HTML and lets you query it for the parts you need. Any of the following will do exactly that:

CsQuery

AngleSharp

Html Agility Pack
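
As a rough illustration of the approach with Html Agility Pack, here is a minimal sketch that pulls the four amounts out of the table in the question. It assumes the HtmlAgilityPack NuGet package is installed and that the page keeps the structure shown above: a label in a td with class "paymentlabel", followed by the value in the next td of the same row. The class name PaymentTableParser and the method ExtractAmounts are hypothetical names chosen for this example, and the regular expression is just one way to isolate the dollar figure from the surrounding text.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;
using HtmlAgilityPack; // assumption: the "HtmlAgilityPack" NuGet package is referenced

// Hypothetical helper, not part of any library.
public static class PaymentTableParser
{
    // Matches the first dollar amount in a cell, e.g. "$47,444".
    private static readonly Regex Amount = new Regex(@"\$[\d,]+");

    // Returns a map of label to amount, e.g. "Transfer Fee" -> "$11".
    public static Dictionary<string, string> ExtractAmounts(string htmlText)
    {
        var result = new Dictionary<string, string>();

        var doc = new HtmlDocument();
        doc.LoadHtml(htmlText);

        // Every label sits in <td class="paymentlabel">.
        // SelectNodes returns null (not an empty list) when nothing matches.
        HtmlNodeCollection labels =
            doc.DocumentNode.SelectNodes("//td[@class='paymentlabel']");
        if (labels == null)
        {
            return result;
        }

        foreach (HtmlNode label in labels)
        {
            // The value lives in the next <td> of the same row.
            HtmlNode valueCell = label.SelectSingleNode("following-sibling::td[1]");
            if (valueCell == null)
            {
                continue;
            }

            // InnerText also contains text such as "(change)", so keep only the amount.
            Match match = Amount.Match(HtmlEntity.DeEntitize(valueCell.InnerText));
            if (match.Success)
            {
                result[label.InnerText.Trim()] = match.Value;
            }
        }

        return result;
    }
}
```

In the controller, the string returned by the crawler could be passed straight in, for example var amounts = PaymentTableParser.ExtractAmounts(htmlText); and the values read with amounts["Transfer Fee"], amounts["Down Payment"], amounts["1st Instalment"] and amounts["Total"]. If the site changes its markup the XPath will need adjusting, so it is worth checking that all four keys are present before using them.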