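I have been trying to use Scrapy to get the text from an abbr tag, but it returns nothing. I am also wondering why most people seem to use XPath selectors rather than CSS.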
<div class="display-post-action">
<div class="display-post-vote"> <span class="like-score ">0</span> <a title="กดปุ่มนี้เพื่อบอกว่าเนื้อหานี้ดี (กดอีกทีเพื่อยกเลิก)" class="icon-heart-like " href="javascript:void(0);"><span></span></a>
</div>
<div class="display-post-emotion"> <a class="emotion-vote" href="javascript:void(0);"> <span class="icon icon-emotion"></span> <span class="emotion-score ">0</span> </a> </div>
<div class="display-post-avatar">
<a href="https://pantip.com/profile/5523231" target="_blank"> <img src="https://ptcdn.info/images/avatar_member_default.png"> </a>
<div class="display-post-avatar-inner"> <a class="display-post-name owner" id="5523231" target="_blank" href="https://pantip.com/profile/5523231">สมาชิกหมายเลข 5523231</a>
<br><!--div no. 184.22.107.146 / 184.22.107.146-->
<span class="display-post-timestamp" style="display:block"> <abbr title="11 ตุลาคม 2562 เวลา 11:52:39 น." data-utime="10/11/2019 11:52:39" class="timeago">11 ตุลาคม เวลา 11:52 น.</abbr>
<span class="display-post-ip unfocus-txt"> [IP: 184.82.25.133]</span>
</span> </div>
</div>
</div>
import scrapy


class QuotesSpider(scrapy.Spider):
    name = "quotes"
    start_urls = [
        'https://pantip.com/topic/39308092',
    ]

    def parse(self, response):
        for quote in response.css('div.container-inner'):
            yield {
                'Main post': quote.css('div.display-post-wrapper.main-post.type div.display-post-story::text').getall(),
                'Owner': quote.css('a.display-post-name.owner::text').getall(),
                'Time': quote.css('span.display-post-timestamp abbr.timeago::text').getall(),
            }