Question

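I want to scrape data using Python. I have tried again and again, but it does not work, and I cannot find the error in my code. This is the code I wrote: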

import re
import requests
from bs4 import BeautifulSoup

url = 'http://news.naver.com/main/ranking/read.nhn?mid=etc&sid1=111&rankingType=popular_week&oid=277&aid=0003773756&date=20160622&type=1&rankingSectionId=102&rankingSeq=1'
# Download the page and parse the returned HTML
html = requests.get(url)
# print(html.text)
a = html.text
bs = BeautifulSoup(a, 'html.parser')
print(bs)
# Look for the first comment element; this prints None
print(bs.find('span', attrs={"class": "u_cbox_contents"}))

I want to scrape the reply (comment) data from a news article.

As you can see, I tried to scrape this from bs:

span, class="u_cbox_contents"

But Python only prints "None":

None

So I checked the contents of the bs variable with print(bs),

but it contains no span with class="u_cbox_contents".

Why is this happening?

I really don't know why.

Please help me.

Thanks for reading.

Answer 1

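requests fetches the content of the URL, but it does not execute any JavaScript.

I performed the same fetch with cURL, and I could not find u_cbox_contents anywhere in the HTML. Most likely it is injected by JavaScript, which explains why BeautifulSoup cannot find it.

If you need the page's code as it would be rendered in a "normal" browser, you could try Selenium. See also this SO question.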
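The cURL check can also be reproduced in Python. The following is only a minimal sketch using the standard library: the helper name static_html_has_class and the two sample strings are made up for illustration and are not the real Naver markup. It reports whether any tag in a piece of raw HTML carries a given class. Text inside a script element is not treated as markup, so a class name that only appears in JavaScript is not counted.

```python
from html.parser import HTMLParser


class ClassFinder(HTMLParser):
    """Remembers whether any start tag carries the wanted class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        for name, value in attrs:
            if name == "class" and value and self.class_name in value.split():
                self.found = True


def static_html_has_class(html_text, class_name):
    """Return True if html_text contains a tag with the given class."""
    finder = ClassFinder(class_name)
    finder.feed(html_text)
    finder.close()
    return finder.found


# Hypothetical samples: one page with the element already in the HTML,
# one where only a script mentions the class name.
static_page = '<div><span class="u_cbox_contents">nice article</span></div>'
script_only_page = '<div id="comments"></div><script>var c = "u_cbox_contents";</script>'

print(static_html_has_class(static_page, "u_cbox_contents"))
print(static_html_has_class(script_only_page, "u_cbox_contents"))
```

Passing html.text from the question's code as the first argument would show whether the span is present in what requests actually downloaded.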
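A rough sketch of the Selenium approach might look like the code below. Several things here are assumptions rather than facts about the page: that Chrome and a matching driver are installed, that the comments show up in the main document (not inside an iframe) within the timeout, and that the selector span.u_cbox_contents from the question is still correct. The function names open_rendered_page and comment_texts are invented for this example.

```python
def open_rendered_page(url, selector, timeout=10):
    """Load url in a real browser and wait until selector appears in the DOM."""
    # Imported here so that comment_texts() below can be used (and tested)
    # even on a machine where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # assumption: Chrome plus its driver are available
    try:
        driver.get(url)
        # Block until JavaScript has inserted at least one matching element
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, selector))
        )
    except Exception:
        driver.quit()
        raise
    return driver


def comment_texts(driver, selector="span.u_cbox_contents"):
    """Return the visible text of every element matching selector."""
    # "css selector" is the string value of Selenium's By.CSS_SELECTOR
    elements = driver.find_elements("css selector", selector)
    return [element.text for element in elements]
```

With the url variable from the question, usage would be along the lines of driver = open_rendered_page(url, "span.u_cbox_contents"), then print(comment_texts(driver)), and finally driver.quit() to close the browser.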
