I am trying to use BeautifulSoup in Python to find the src of a video tag.
import requests
from urllib.request import urlopen
from bs4 import BeautifulSoup as BS
url = '<some url>'
client_id = {'Client-ID': '<some id>'}
json_data = requests.get(url, headers=client_id).json()
def download(some_url):
    html_page = urlopen(some_url)
    soup = BS(html_page, "html.parser")
    link_to_vid = soup.find('video')['src']
    print(link_to_vid)
    # urllib.request.urlretrieve(video)

for x in range(0, num_clips):
    resp_url = json_data['data'][x]['url']
    print(resp_url)
    download(resp_url)
When I run this script, the output I get is:
*link from print(resp_url)*
Traceback (most recent call last):
  File "script.py", line 28, in <module>
    download(resp_url)
  File "script.py", line 18, in download
    link_to_vid = soup.find('video')['src']
TypeError: 'NoneType' object is not subscriptable
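This error can be reproduced without any network access, which suggests that `soup.find('video')` returned `None` before `['src']` was applied. The following is a minimal sketch under that assumption; the HTML string `html_without_video` is a made-up placeholder, not the actual page.

```python
from bs4 import BeautifulSoup as BS

# Hypothetical stand-in for a page whose fetched HTML has no <video> tag.
html_without_video = "<html><body><div id='player'></div></body></html>"

soup = BS(html_without_video, "html.parser")
video_tag = soup.find("video")  # find() returns None when nothing matches

try:
    video_tag["src"]  # subscripting None raises TypeError
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__
    print(error_name, exc)
```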
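It seems to me that this error occurs because BeautifulSoup cannot find a video tag on the page. I tried printing the whole HTML page that BeautifulSoup received, but it does not seem to contain the full page, or at least not what I see in Chrome DevTools.
Am I getting this error because the video is nested deep inside divs? Why can't BeautifulSoup get the whole HTML page?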
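On the nesting question, a quick offline check suggests that depth is probably not the cause, since `find` searches all descendants of the document. The sketch below compares two made-up HTML strings (both hypothetical, not the real page): one with a deeply nested `<video>`, and one where the tag is absent from the HTML, as would be the case if the page inserts it with JavaScript after load. `urlopen` does not run scripts, whereas Chrome DevTools shows the DOM after they have run. The helper name `find_video_src` is also hypothetical.

```python
from bs4 import BeautifulSoup as BS


def find_video_src(html):
    """Return the src of the first <video> tag, or None if there is none."""
    soup = BS(html, "html.parser")
    video_tag = soup.find("video")
    if video_tag is None:
        return None
    # .get() returns None instead of raising KeyError when src is missing
    return video_tag.get("src")


# Hypothetical page with the video nested several divs deep.
nested_html = "<div><div><div><video src='clip.mp4'></video></div></div></div>"
# Hypothetical page where a script would add the player at runtime.
script_only_html = "<div id='player'></div><script>/* adds video later */</script>"

nested_src = find_video_src(nested_html)
missing_src = find_video_src(script_only_html)
print(nested_src, missing_src)
```

If the second case matches the real page, printing `soup.prettify()` for the fetched HTML should show no `<video>` element at all, and the video URL would have to come from somewhere other than the static HTML.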