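I tried this Beautiful Soup code for scraping comments from Facebook, taken from the following link: https://python.gotrained.com/scraping-facebook-posts-comments/ To run it, apart from the main complete code given on the site, you need a structured JSON credentials file containing the username and password, and a list of the public Facebook pages to scrape (samples of both are provided at the link). I followed the instructions and ran the code, but got the following error: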
INFO:root:[*] Logged in.
Traceback (most recent call last):
  File "/Users/vivekrmk/Documents/Github_general/scrape_fb_beautiful_soup/facebook_scrapper_soup.py", line 215, in <module>
    posts_data = crawl_profile(session, base_url, profile_url, 100)
  File "/Users/vivekrmk/Documents/Github_general/scrape_fb_beautiful_soup/facebook_scrapper_soup.py", line 72, in crawl_profile
    show_more_posts_url = profile_bs.find('div', id=posts_id).next_sibling.a['href']
AttributeError: 'NoneType' object has no attribute 'a'
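As far as I can tell from the traceback, `profile_bs.find('div', id=posts_id)` did return an element, but its `.next_sibling` was `None`, so the `.a` lookup failed. That might happen if the page has no "show more" link or if the markup differs from what the tutorial expects, though I have not confirmed either. A minimal sketch of a guard around that lookup follows; `find_show_more_url` is a hypothetical helper name of my own, not part of the tutorial code, and it only assumes that `profile_bs` is the parsed BeautifulSoup page.

```python
def find_show_more_url(profile_bs, posts_id):
    """Return the href of the 'show more posts' link, or None if it is missing.

    Hypothetical helper: profile_bs is assumed to be the BeautifulSoup object
    for the profile page, posts_id the id of the posts container <div>.
    """
    posts_div = profile_bs.find('div', id=posts_id)
    if posts_div is None:
        return None  # the posts container itself was not found

    sibling = posts_div.next_sibling
    # next_sibling may be None (no sibling at all) or a node with no <a> child;
    # getattr with a default covers both cases without raising AttributeError.
    link = getattr(sibling, 'a', None)
    if link is None:
        return None

    return link.get('href')
```

With something like this, the loop could stop paging when the helper returns `None` instead of crashing, for example `if show_more_posts_url is None: break`.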
When I comment out lines 70 to 76 in the main code:
# show_more_posts_url = None
# if not posts_completed(scraped_posts, post_limit):
#     show_more_posts_url = profile_bs.find('div', id=posts_id).next_sibling.a['href']
#     profile_bs = get_bs(session, base_url+show_more_posts_url)
#     time.sleep(3)
# else:
#     break
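I am able to get the output as JSON, and every field (post URL, post text, and media_url) has a value except the comments field, which is an empty list. I need help with the above so that I can scrape the comments as well. Thanks in advance!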