How to skip items under a label when scraping a website with Python

Time: 2020-01-08 13:37:30

Tags: python web-scraping beautifulsoup

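I am trying to extract comments from a sample web forum. While extracting the comments, I want to exclude the ones that sit under a "Quote" label. How can I ignore those and extract the rest, given that both are under the same class?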

import requests
from bs4 import BeautifulSoup

# The headers value was not shown in the question; this User-Agent is a placeholder.
headers = {"User-Agent": "Mozilla/5.0"}

url = "https://www.f150forum.com/f118/would-you-buy-f150-again-463954/index3/"
page = requests.get(url, headers=headers)
soup = BeautifulSoup(page.text, 'html.parser')
domains = soup.find_all("div")
posts = soup.find(id = "posts")
comments_class = soup.findAll('div',attrs={"class":"ism-true"})
comments = [row.get_text() for row in comments_class]

Thanks.

3 answers:

Answer 0: (score: 1)

Starting from this url, filter the comments so that only those without a Quote label remain.
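A minimal sketch of such a filter, assuming the comments have already been extracted as plain text with get_text() and that a quoted post's text begins with "Quote:" (the shape seen in the output further down this page). The function name drop_quoted and the sample list are hypothetical, not taken from the original answer.

```python
def drop_quoted(comments):
    """Return only the comment texts that do not begin with a 'Quote:' label."""
    return [text for text in comments if not text.lstrip().startswith("Quote:")]


# Hypothetical sample that mirrors the text shape get_text() returns for this forum.
comments = [
    "Sure would!",
    "Quote:\n\n\r\n\t\t\tOriginally Posted by kozal01\n\n\nYup.\n\nThis is also my response.",
    "Yes",
]

kept = drop_quoted(comments)
print(kept)
```

This drops a quoting post entirely, including the reply written below the quote, so it only fits when whole posts with quotes should be skipped.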

Answer 1: (score: 0)

The problem is your selector: it needs to be updated so that it skips any div containing a label whose text is Quote.
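One way this could look with BeautifulSoup alone is to pass a function to find_all, so that the selection itself rejects comment divs that contain a "Quote:" label. This is only a sketch: the inline HTML is a hypothetical stand-in for the forum page, the real nesting of the label may differ, and is_unquoted_comment is a name made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real page may nest the "Quote:" label differently.
html = """
<div id="posts">
  <div class="ism-true">Sure would!</div>
  <div class="ism-true">
    <div>
      <div>Quote:</div>
      <div>Originally Posted by someone</div>
      <div>Yes</div>
    </div>
    This is also my response.
  </div>
</div>
"""


def is_unquoted_comment(tag):
    """Match div.ism-true elements with no descendant div whose text is 'Quote:'."""
    if tag.name != "div" or "ism-true" not in tag.get("class", []):
        return False
    return not any(
        inner.get_text(strip=True) == "Quote:" for inner in tag.find_all("div")
    )


soup = BeautifulSoup(html, "html.parser")
unquoted = [div.get_text(strip=True) for div in soup.find_all(is_unquoted_comment)]
print(unquoted)
```

Because the check runs inside find_all, no second pass over the extracted text is needed.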

Answer 2: (score: 0)

When that element is present, get the third div element after it:

import requests
from bs4 import BeautifulSoup
import re


url = "https://www.f150forum.com/f118/would-you-buy-f150-again-463954/index3/"
page = requests.get(url)
soup = BeautifulSoup(page.text, 'html.parser')
domains = soup.find_all("div")
posts = soup.find(id = "posts")
comments_class = soup.findAll('div',attrs={"class":"ism-true"})   

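comments_without_quote = []
for each in comments_class:
    if 'Quote' in each.text:
        # Locate the "Quote:" label div, then step forward to the third div after it.
        nextComment = each.find('div', text=re.compile(r'Quote:')).findNext('div').findNext('div').findNext('div')
        comments_without_quote.append(nextComment.text.strip())

    # The full text of the comment is appended in every case, so a post that
    # contains a quote still appears in the list with its "Quote:" prefix.
    comments_without_quote.append(each.text.strip())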

Output:

print (comments_without_quote)
["I would say it depends.  I really miss my 2014, I think that generation of truck was very well sorted out.  beyond that I don't know.  I am NOT buying a chevy or a ram.  That leaves toyota.  which holy crap that truck is ancient.  I would just have to drive one.  I really don't want to leave ford but this generation of truck is not proving to be the most reliable for me.   I trade my trucks at 100k miles.  I also don't believe that when toyota updates their truck next year that I want to be the guinea pig there either.", "Maybe the question should be...would you buy the same F150 again!\n\nI'm leaning towards an F250 with diesel.\n\nRealistically what are the options?\n\nFord\nGM/Chevy\nRam\nToyota\nNissan\n\nThats it! Unlike say SUVs where you may have 25 to choose from...full size trucks have five.", 'Sure would!', 'I am going to take a long hard look at Ram if they end up coming out with the rumored turbo\'d Inline 6 motor. I have had 4-5 Rams over the last few years as rentals and put a couple thousand miles on them combined and they are very appealing trucks. The 8 speed is fantastic, the ride is better than any stock F150 I have been in and, as you mentioned, they are better optioned for the price than an F150. Most of the ones I have had were BigHorns and they were all nicely optioned. I did have a 2019 Ram Rebel in San Antonio and that was a cool rig with the factory lift and 33" duratracs.', 'Quote:\n\n\r\n\t\t\tOriginally Posted by mass-hole\n\n\nI am going to take a long hard look at Ram if they end up coming out with the rumored turbo\'d Inline 6 motor. I have had 4-5 Rams over the last few years as rentals and put a couple thousand miles on them combined and they are very appealing trucks. The 8 speed is fantastic, the ride is better than any stock F150 I have been in and, as you mentioned, they are better optioned for the price than an F150. Most of the ones I have had were BigHorns and they were all nicely optioned. 
I did have a 2019 Ram Rebel in San Antonio and that was a cool rig with the factory lift and 33" duratracs.\n\nI\'ve been in a rental 2019 Big Horn the past week. It has heated steering wheel, power fold side mirrors, 4-auto (and it\'s a better 4-auto than my Lariat\'s by feel), remote start, heated seats, push button start (though no proxy entry), forward parking sensors, two tone soft touch interior, etc...and it\'s an XLT rival. That\'s kind of crazy.', 'Yup, as long as I need a half ton truck it will be an F-150.', "Quote:\n\n\r\n\t\t\tOriginally Posted by kozal01\n\n\nYup, as long as I need a half ton truck it will be an F-150.\n\nThis is also my response. My truck has been good to me, it's my first Ford and I like it compared to the last 3 trucks I've had since 2002 (Dodge, GMC, Titan). Although I'm thinking next year I may put the F250 option on the table.", 'Yes', "I've been in a rental 2019 Big Horn the past week. It has heated steering wheel, power fold side mirrors, 4-auto (and it's a better 4-auto than my Lariat's by feel), remote start, heated seats, push button start (though no proxy entry), forward parking sensors, two tone soft touch interior, etc...and it's an XLT rival. That's kind of crazy.", "Quote:\n\n\r\n\t\t\tOriginally Posted by blkZ28spt\n\n\nI've been in a rental 2019 Big Horn the past week. It has heated steering wheel, power fold side mirrors, 4-auto (and it's a better 4-auto than my Lariat's by feel), remote start, heated seats, push button start (though no proxy entry), forward parking sensors, two tone soft touch interior, etc...and it's an XLT rival. That's kind of crazy.\n\n\r\nHow can that be the XLT rival when you can't get some (most?) of those options on an XLT?\n\r\nMine will be a Ford.\n\r\nPeriod.", "I will probably buy Ford again. My truck is not perfect, but it has been pretty trouble free. It is certainly the quietest, most comfortable, most powerful vehicle I've owned.", "For me probably not. 
My decision was purely driven by the need to tow my trailer. I didn't want a 3/4 ton truck. I picked a half ton that could handle the most weight and power to pull, which this truck does well. However if I do decide to upgrade it'll be a Ram 2500, it's the best ride in a 3/4 ton. If not, no more trucks, I'm getting myself a Jaguar XJ.", 'I would.  Been very happy with my 4x4 RCSB.']
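The output above still contains quoted text: each quoting post appears once as the bare quote and again in full with its "Quote:" prefix. If the goal is to keep only what each poster wrote, one option is to remove the quote block from the parsed tree before reading the text. The sketch below assumes the "Quote:" label and the quoted text share a wrapper div inside the comment; that structure and the sample HTML are assumptions and were not verified against the live page.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup: a wrapper div holds the label, the attribution and the quote.
html = """
<div class="ism-true">Sure would!</div>
<div class="ism-true">
  <div>
    <div>Quote:</div>
    <div>Originally Posted by kozal01</div>
    <div>Yup, as long as I need a half ton truck it will be an F-150.</div>
  </div>
  This is also my response.
</div>
"""

soup = BeautifulSoup(html, "html.parser")
replies = []
for comment in soup.find_all("div", class_="ism-true"):
    # Find the label div whose only content is the "Quote:" string.
    label = comment.find("div", string=re.compile(r"^\s*Quote:"))
    # Remove the wrapper around the quote, but never the comment div itself.
    if label is not None and label.parent is not comment:
        label.parent.decompose()
    replies.append(comment.get_text(" ", strip=True))

print(replies)
```

Unlike skipping quoting posts altogether, this keeps the reply text written below each quote.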