我当时正在用bs4抓取weather-searched Google,而Python找不到<span>
标签。我该如何解决这个问题?
我试图用<span>
和 class
找到这个id
,但是都失败了。
<div id="wob_dcp">
<span class="vk_gy vk_sh" id="wob_dc">Clear with periodic clouds</span>
</div>
上面是我试图在page中抓取的HTML代码:
对不起,由于声誉我无法发布图片^^;
response = requests.get('https://www.google.com/search?hl=ja&ei=coGHXPWEIouUr7wPo9ixoAg&q=%EC%9D%BC%EB%B3%B8+%E6%A1%9C%E5%B7%9D%E5%B8%82%E7%9C%9F%E5%A3%81%E7%94%BA%E5%8F%A4%E5%9F%8E+%EB%82%B4%EC%9D%BC+%EB%82%A0%EC%94%A8&oq=%EC%9D%BC%EB%B3%B8+%E6%A1%9C%E5%B7%9D%E5%B8%82%E7%9C%9F%E5%A3%81%E7%94%BA%E5%8F%A4%E5%9F%8E+%EB%82%B4%EC%9D%BC+%EB%82%A0%EC%94%A8&gs_l=psy-ab.3...232674.234409..234575...0.0..0.251.929.0j6j1......0....1..gws-wiz.......35i39.yu0YE6lnCms')
soup = BeautifulSoup(response.content, 'html.parser')
tomorrow_weather = soup.find('span', {'id': 'wob_dc'}).text
但是此代码失败,错误是:
Traceback (most recent call last):
File "C:\Users\sungn_000\Desktop\weather.py", line 23, in <module>
tomorrow_weather = soup.find('span', {'id': 'wob_dc'}).text
AttributeError: 'NoneType' object has no attribute 'text'
请解决此错误。
答案 0 :(得分:4)
这是因为天气部分是由浏览器通过JavaScript呈现的。因此,当您使用requests
时,只会得到页面上没有所需内容的HTML内容。
如果要使用Web浏览器呈现的元素来解析页面,则应使用selenium
(或requests-html
)。
from bs4 import BeautifulSoup
from requests_html import HTMLSession
session = HTMLSession()
response = session.get('https://www.google.com/search?hl=en&ei=coGHXPWEIouUr7wPo9ixoAg&q=%EC%9D%BC%EB%B3%B8%20%E6%A1%9C%E5%B7%9D%E5%B8%82%E7%9C%9F%E5%A3%81%E7%94%BA%E5%8F%A4%E5%9F%8E%20%EB%82%B4%EC%9D%BC%20%EB%82%A0%EC%94%A8&oq=%EC%9D%BC%EB%B3%B8%20%E6%A1%9C%E5%B7%9D%E5%B8%82%E7%9C%9F%E5%A3%81%E7%94%BA%E5%8F%A4%E5%9F%8E%20%EB%82%B4%EC%9D%BC%20%EB%82%A0%EC%94%A8&gs_l=psy-ab.3...232674.234409..234575...0.0..0.251.929.0j6j1......0....1..gws-wiz.......35i39.yu0YE6lnCms')
soup = BeautifulSoup(response.content, 'html.parser')
tomorrow_weather = soup.find('span', {'id': 'wob_dc'}).text
print(tomorrow_weather)
输出:
pawel@pawel-XPS-15-9570:~$ python test.py
Clear with periodic clouds
答案 1 :(得分:0)
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(a)
>>> a
'<div id="wob_dcp">\n <span class="vk_gy vk_sh" id="wob_dc">Clear with periodic clouds</span> \n</div>'
>>> soup.find("span", id="wob_dc").text
'Clear with periodic clouds'
尝试一下。
答案 2 :(得分:-1)
我也遇到了这个问题。 你不应该这样导入
from bs4 import BeautifulSoup
你应该像这样导入
from bs4 import *
这应该有效。