Finding the src of a video tag inside a div with BeautifulSoup

Time: 2018-11-22 09:01:44

Tags: python html beautifulsoup

I am trying to find the src of a video tag using BeautifulSoup in Python.

import requests
from urllib.request import urlopen
from bs4 import BeautifulSoup as BS

url = '<some url>'
client_id = {'Client-ID': '<some id>'}

json_data = requests.get(url, headers=client_id).json()

def download(some_url):
    html_page = urlopen(some_url)
    soup = BS(html_page, "html.parser")

    link_to_vid = soup.find('video')['src']

    print(link_to_vid)

    # urllib.request.urlretrieve(video)


# num_clips is not defined in the snippet as posted; presumably it is set elsewhere in the script
for x in range(0, num_clips):
    resp_url = json_data['data'][x]['url']
    print(resp_url)
    download(resp_url)

When I run this script, the output I get is:

*link from print(resp_url)*
Traceback (most recent call last):
  File "script.py", line 28 in <module>
      download(resp_url)
  File "script.py", line 18 in download
      link_to_vid = soup.find('video')['src']
TypeError: 'NoneType' object is not subscriptable

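It seems to me that this error occurs because BeautifulSoup cannot find a video tag on the page. I tried printing the entire HTML page that BeautifulSoup receives, but it does not appear to contain the whole page, or at least not everything I see in Chrome DevTools.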
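The diagnosis above can be checked in isolation. The following is a minimal sketch, not the original script: `extract_video_src` is a hypothetical helper, and both HTML strings and the example.com URL are made up for illustration. It shows that `find` returns `None` when the parsed markup contains no video tag, and how guarding against that avoids the TypeError.

```python
from bs4 import BeautifulSoup


def extract_video_src(html):
    """Return the src of the first <video> tag, or None if there is none.

    Hypothetical helper for illustration; not part of the original script.
    """
    soup = BeautifulSoup(html, "html.parser")
    video = soup.find("video")  # None when the parsed HTML has no <video> tag
    if video is None:
        return None
    return video.get("src")  # .get() returns None instead of raising if src is absent


# Made-up sample markup, one page with a video tag and one without
with_video = '<div><video src="https://example.com/clip.mp4"></video></div>'
without_video = '<div id="player"></div>'

print(extract_video_src(with_video))
print(extract_video_src(without_video))
```

With a guard like this, a page without a video tag yields `None` that the caller can skip or log, instead of stopping the loop with an exception.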

Am I getting this error because the video is nested deep inside divs? Why can't BeautifulSoup get the entire HTML page?
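On the nesting question, a small experiment may help. By default `find` searches all descendants of the document, so depth alone should not hide a tag. One plausible but unconfirmed explanation for the missing markup is that the page builds its player with JavaScript after loading, which `urlopen` does not execute, so the downloaded HTML would differ from what Chrome DevTools shows. The sketch below uses two made-up HTML strings (the URLs and the `/static/app.js` path are assumptions for illustration) to contrast the two cases.

```python
from bs4 import BeautifulSoup

# Made-up markup: a <video> tag nested several <div> levels deep
nested_html = """
<div class="outer">
  <div class="middle">
    <div class="inner">
      <video src="https://example.com/nested.mp4"></video>
    </div>
  </div>
</div>
"""

# Made-up markup resembling a page whose content is injected later by a script;
# the static HTML contains no <video> tag at all
script_only_html = """
<div id="root"></div>
<script src="/static/app.js"></script>
"""

nested_soup = BeautifulSoup(nested_html, "html.parser")
script_soup = BeautifulSoup(script_only_html, "html.parser")

# find() searches all descendants, so the nesting depth does not matter
nested_video = nested_soup.find("video")
print(nested_video["src"])

# No <video> tag exists in the static HTML, so find() returns None
missing_video = script_soup.find("video")
print(missing_video)
```

If the real page behaves like the second string, the video URL would have to come from somewhere other than the static HTML, for example from the JSON response the script already requests or from a tool that renders JavaScript. Whether that applies here cannot be confirmed from the post alone.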

0 answers:

No answers yet