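XPath - getting information from divs in a flat HTML structure

Time: 2013-12-06 02:57:39

Tags: html xpath

I am really struggling to get information out of the divs in this loop. Can anyone shed some light on this? At the moment I am using the following context: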

//li[@class='review_tr']

The following expressions work fine:

//span[@id='hp_hotel_name'] - hotel name
.//p[@class='comments_good'] - positive review
.//p[@class='comments_bad'] - negative review

But I cannot grab any of the other fields, for example:

<div class="name"> - reviewer name
<div class="location"> - reviewer location
<div class="user_type"> - user type.

Does anyone know how to grab these three fields?

Thanks in advance for your help.

<li id="747282424" class=" review_tr">
<div class="cell_user"
alt="hotel Review - Mature couple"
>
<div class="user_profile_wrapper"><div class="user_profile_avatar">
<div class="review_avatar mature couple">
<!--

--> 

</div>
</div></div> 
<div class="user_profile">
<div class="name">
Anonymous
</div>
<div class="user_type">
Mature couple
</div>
<div class="location">

Switzerland
</div>
<div class="date">24 March 2013</div>
</div>

</div>
<div class="speech_bubble_container"><div class="speech_bubble">
<div class="cell_comments">
<div id="area_comments_747282424">
<p class="comments_good" lang="en">
Located at 5 minutes walk from Petronas Towers and with many bars and
restaurant around. Very large room with all the needed comfort. Fast and
free wifi. Free parking. Free toproof pool with great view.
</p>
<!-- Start review_no_thumbs.inc -->
<div class="no_thumbs">

<span class="vote_copy">
Did you find this review helpful?
</span>
<span class="review_feedback">
<form class="review_useful_form" action="/feedback?type=review_feedback&amp;comment=1&amp;object_id=747282424&amp;hotel_id=175845" method="post">
<!-- 175845 -->
<!-- 175845 -->
<button class="review_no_thumbs_yes"  type="submit">yes</button>
</form>
<form class="review_useful_form" action="/feedback?type=review_feedback&amp;comment=0&amp;object_id=747282424&amp;hotel_id=175845" method="post">
<button class="review_no_thumbs_no"  type="submit">no</button>
</form>
</span>

<div style="clear:both"></div>
</div>

<!-- End review_no_thumbs.inc -->
</div>
</div>
<div class="cell_score">
<span class="the_score">9.6</span>
</div>
</div></div>
<hr class="clearfix" />
</li>

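1 answer:

Answer 0 (score: 0)

You should apply the same logic you used to get the positive and negative reviews.

You can match any div inside your li that has the class name: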

.//div[@class='name']

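Or navigate to it with a more detailed path, so that you do not pick up a div with class name that sits outside cell_user, for example:

./div[@class='cell_user']/div[@class='user_profile']/div[@class='name']

Getting the location and user_type divs the same way should then be a trivial task.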
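
For illustration, here is a minimal Python sketch of the same idea. It is not the asker's actual setup: it assumes a trimmed, well-formed copy of the markup and uses the standard library's xml.etree.ElementTree, which supports only a subset of XPath. Real-world HTML would normally need an HTML parser first. The names SAMPLE, FIELDS, text_of and extract_reviews are hypothetical helpers invented for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the review markup from the question (assumption:
# the real page would need an HTML parser before XPath can be applied).
SAMPLE = """
<ul>
  <li id="747282424" class=" review_tr">
    <div class="cell_user">
      <div class="user_profile">
        <div class="name">
          Anonymous
        </div>
        <div class="user_type">
          Mature couple
        </div>
        <div class="location">

          Switzerland
        </div>
        <div class="date">24 March 2013</div>
      </div>
    </div>
  </li>
</ul>
"""

FIELDS = ("name", "user_type", "location")

# The more specific path from the answer, anchored at the <li> context node.
STRICT_NAME = "./div[@class='cell_user']/div[@class='user_profile']/div[@class='name']"


def text_of(context, path):
    """Return the whitespace-normalized text of the first match, or None."""
    node = context.find(path)
    if node is None:
        return None
    return " ".join("".join(node.itertext()).split())


def extract_reviews(markup):
    """Collect the reviewer fields of every review <li> in the markup."""
    root = ET.fromstring(markup)
    reviews = []
    for li in root.iter("li"):
        # The attribute value is " review_tr" (leading space), so compare
        # class tokens instead of the raw string.
        if "review_tr" not in li.get("class", "").split():
            continue
        # Each path starts with "." so it is evaluated relative to this <li>.
        reviews.append({f: text_of(li, f".//div[@class='{f}']") for f in FIELDS})
    return reviews


reviews = extract_reviews(SAMPLE)
first_li = next(ET.fromstring(SAMPLE).iter("li"))
strict_name = text_of(first_li, STRICT_NAME)
print(reviews, strict_name)
```

The leading "." is what keeps each lookup inside the current review. An expression that starts with "//" searches the whole document instead, which is fine for the single hotel name but not for per-review fields.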