Unable to get a span tag using XPath

Time: 2013-03-21 10:28:49

Tags: php html dom xpath

<div class="product_box">
    <div class="list_sale">
        <img src="link" class="listsale" alt="">
            <div class="product_box_title">
                <a href="link"><strong>title here</a></strong>
            </div>
            <div class="product_box_desc">
                some text here
                <strike>some text</strike> 
                <br />
                <span class="list_price">THIS IS THE NEEDED TEXT</span>
                <a href="link"><strong>some text</strong></a>
            </div>
            <div class="list_buynow">
            <form action="link" class="add_to_cart" method="post">
                <div class="add_cart">
                    <input type="image" src="link" value="add_to_cart" class="add_button">
                    <input id="fast_order_0_item_code" type="hidden" name="fast_order[0] [item_code]" value="value" class="item_code"/>
                    <input name="fast_order[0][add]" value="1" class="add_qty">
                    <input type="hidden" name="redirect_uri" value="value">
                </div>
            </form>
        </div>
       <div class="product_box_img">
           <a href="link">
               <a href="link"><img src="http://stacktoheap.com/images/stackoverflow.png" alt=""></a>
           </a>  
       </div>
   </div>
</div>

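This is my HTML file, and from this div I need to extract "THIS IS THE NEEDED TEXT". I have been able to get the div by its class "product_box_desc", and from it I can get the text "some text here" beneath it. But I cannot get the span that contains the text. This is the XPath query I am using; please suggest what needs to change.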

$dom_xpath->query("//div[@class='product_box']/div/div[@class='product_box_desc']/span[@class='list_price']")

1 answer:

Answer 0 (score: 1)

This query works for me:

//div[@class="product_box"]/div[@class="list_sale"]/div[@class="product_box_desc"]/span[@class="list_price"]

But I changed the HTML to this:

<div class='product_box'>
    <div class='list_sale'>
        <img src='link' class='listsale' alt='' />
        <div class='product_box_title'>
            <a href='link'><strong>title here</strong></a>
        </div>
        <div class='product_box_desc'>
            some text here
            <strike>some text</strike> 
            <br />
            <span class='list_price'>THIS IS THE NEEDED TEXT</span>
            <a href='link'><strong>some text</strong></a>
        </div>
        <div class='list_buynow'>
        <form action='link' class='add_to_cart' method='post'>
            <div class='add_cart'>
                <input type='image' src='link' value='add_to_cart' class='add_button'>
                <input id='fast_order_0_item_code' type='hidden' name='fast_order[0] [item_code]' value='value' class='item_code'/>
                <input name='fast_order[0][add]' value='1' class='add_qty'>
                <input type='hidden' name='redirect_uri' value='value'>
            </div>
        </form>
        </div>
       <div class='product_box_img'>
           <a href='link'>
               <img src='http://stacktoheap.com/images/stackoverflow.png' alt=''>
           </a>  
       </div>
   </div>
</div>

The original markup had an error here; the tags should be nested as
<a href='link'><strong>title here</strong></a> rather than <a href='link'><strong>title here</a></strong>
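
One way to notice markup problems like this before writing any XPath is to ask libxml what it complained about while parsing. The sketch below is only an illustration, not part of the original answer: `collectHtmlParseMessages` is a hypothetical helper name, and the exact messages (or whether any appear at all) depend on the libxml2 version PHP was built against.

```php
<?php
/**
 * Hypothetical helper: parse an HTML string and return the messages
 * libxml recorded while parsing it.
 */
function collectHtmlParseMessages(string $html): array
{
    // Keep parse problems in libxml's internal buffer instead of PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        // Each entry is a LibXMLError with line, column, and message fields.
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    // Reset the buffer and restore the previous error-handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $messages;
}

// The misnested fragment from the question.
$misnested = '<div><a href="link"><strong>title here</a></strong></div>';
foreach (collectHtmlParseMessages($misnested) as $message) {
    echo $message, "\n";
}
```

If the parser reports a tag mismatch for that line, fixing the nesting first makes the resulting DOM tree easier to reason about.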

This is how I do it:

// $xPath is a DOMXPath instance created from the loaded document.
$nodes = $xPath->query('//div[@class="product_box"]/div[@class="list_sale"]/div[@class="product_box_desc"]/span[@class="list_price"]');

foreach ($nodes as $node) {
    echo $node->textContent;
}
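
For completeness, here is a self-contained sketch of the whole flow, from loading the markup to printing the span text. It assumes the HTML is available as a string (the `$html` variable and the trimmed-down markup are assumptions for illustration; the original post does not show how the document was loaded) and uses `DOMDocument::loadHTML` with `DOMXPath`.

```php
<?php
// Trimmed-down copy of the corrected markup; only the parts the query touches.
$html = <<<'HTML'
<div class="product_box">
    <div class="list_sale">
        <div class="product_box_title">
            <a href="link"><strong>title here</strong></a>
        </div>
        <div class="product_box_desc">
            some text here
            <strike>some text</strike>
            <br />
            <span class="list_price">THIS IS THE NEEDED TEXT</span>
            <a href="link"><strong>some text</strong></a>
        </div>
    </div>
</div>
HTML;

// Buffer libxml notices internally so they are not emitted as PHP warnings.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xPath = new DOMXPath($dom);
$nodes = $xPath->query('//div[@class="product_box"]/div[@class="list_sale"]/div[@class="product_box_desc"]/span[@class="list_price"]');

// Collect the trimmed text of every matching span.
$prices = [];
foreach ($nodes as $node) {
    $prices[] = trim($node->textContent);
}

echo implode("\n", $prices), "\n";
```

With the markup above, `$prices` holds a single entry, the text of the `list_price` span.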