C# HTML parsing of a web page

Time: 2013-12-06 07:31:26

Tags: c# html html-parsing

I am trying to parse an HTML web page to extract information from it. Here is a sample of the source:

<div class="market_listing_row market_recent_listing_row listing_2107979855708535333" id="listing_2107979855708535333">

<div class="market_listing_item_img_container">     <img id="listing_2107979855708535333_image" src="asdgfasdfgasgasgdasgasdgsdasgsadg" style="border-color: #D2D2D2;" class="market_listing_item_img" alt="" />    </div>
        <div class="market_listing_right_cell market_listing_action_buttons">
                <div class="market_listing_buy_button">
                            <a href="javascript:BuyMarketListing('listing', '2107979855708535333', 570, '2', '508690045')" class="item_market_action_button item_market_action_button_green">
                <span class="item_market_action_button_edge item_market_action_button_left"></span>
                <span class="item_market_action_button_contents">
                    Buy Now                 </span>
                <span class="item_market_action_button_edge item_market_action_button_right"></span>
                <span class="item_market_action_button_preload"></span>
            </a>
                        </div>
        </div>
    <div class="market_listing_right_cell market_listing_their_price">
    <span>
                    <span class="market_listing_price market_listing_price_with_fee">
            0,03 p&#1091;&#1073;.           </span>
        <span class="market_listing_price market_listing_price_without_fee">
            0,01 p&#1091;&#1073;.           </span>
        <br/>
                </span>

Basically, I need to get the text contained in:
<span class="market_listing_price market_listing_price_with_fee">
        0,03 p&#1091;&#1073;.           </span>

I tried to use HtmlAgilityPack, but I can't seem to figure it out.

2 answers:

Answer 0 (score: 1)

You can use HtmlAgilityPack:

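// Requires the HtmlAgilityPack package; WebUtility lives in System.Net.
using System.Net;

var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html); // html is the page markup as a string, not a URL

// Select the with-fee price span (the one the question asks for) by its exact class value.
var node = doc.DocumentNode
            .SelectSingleNode("//span[@class='market_listing_price market_listing_price_with_fee']");

// Decode entities such as &#1091; into readable characters.
var text = WebUtility.HtmlDecode(node.InnerText);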
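
The question asks for the with-fee span, and SelectSingleNode returns null when nothing matches, so it may help to wrap the lookup in a small helper. The following is only a sketch under assumptions: the class name PriceExtractor and the method GetPriceWithFee are hypothetical names chosen for illustration, and the XPath uses a contains() substring match instead of the exact class string.

```csharp
using System;
using System.Net;
using HtmlAgilityPack;

static class PriceExtractor
{
    // Hypothetical helper: returns the decoded, trimmed with-fee price text,
    // or null when no matching span exists in the markup.
    public static string GetPriceWithFee(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // contains() does a substring match on the class attribute, so the
        // lookup still works if other classes are added or reordered.
        // "market_listing_price_with_fee" is not a substring of
        // "market_listing_price_without_fee", so only the with-fee span matches.
        var node = doc.DocumentNode.SelectSingleNode(
            "//span[contains(@class, 'market_listing_price_with_fee')]");

        // SelectSingleNode returns null when nothing matches.
        if (node == null)
            return null;

        // Decode entities such as &#1091; and strip the surrounding whitespace.
        return WebUtility.HtmlDecode(node.InnerText).Trim();
    }
}
```

Calling PriceExtractor.GetPriceWithFee(html) on the sample from the question should yield the price text without the padding around it. A caller can check for null before using the result.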

Answer 1 (score: 0)

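I found that you can't just pass a URL to doc.LoadHtml, because it expects the HTML markup itself. You have to download the page first, using HttpWebRequest and its HttpWebResponse.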
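
A minimal sketch of that download-then-parse flow might look like the following. The class name PageLoader and its method names are hypothetical, the reader assumes the page is UTF-8 encoded, and error handling (timeouts, non-200 responses) is left out.

```csharp
using System.IO;
using System.Net;
using HtmlAgilityPack;

static class PageLoader
{
    // Hypothetical helper: fetches the raw markup of the page at `url`.
    public static string DownloadHtml(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var stream = response.GetResponseStream())
        using (var reader = new StreamReader(stream)) // assumes UTF-8 content
        {
            return reader.ReadToEnd();
        }
    }

    // Hypothetical helper: LoadHtml takes the markup string, never the URL.
    public static HtmlDocument Parse(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        return doc;
    }
}
```

Usage would be along the lines of var doc = PageLoader.Parse(PageLoader.DownloadHtml(url)); where url is whatever page address you are scraping. HtmlAgilityPack also provides HtmlWeb, whose Load(url) method downloads and parses in one call, and on newer .NET versions HttpClient is the usual replacement for HttpWebRequest.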