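If I visit this page and inspect it, I can see the image on the page in an img tag.
However, when I fetch the page with requests and parse it with BeautifulSoup, I cannot find that same image. What am I missing here?
The code runs without errors, and the request returns 200 as the status_code.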
import requests
from bs4 import BeautifulSoup

url = 'https://mangadex.org/chapter/435396/2'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.76 Safari/537.36'}

# Fetch the raw HTML exactly as the server sends it (no JavaScript is executed).
page = requests.get(url, headers=headers)
print(page.status_code)

# Parse the response and print every <img> tag found in it.
soup = BeautifulSoup(page.text, 'html.parser')
img_tags = soup.find_all('img')
for img in img_tags:
    print(img)
Edit:
As suggested, the Selenium option works. But is there a way to make it as fast as BeautifulSoup?
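Answer 0 (score: 1)
The JavaScript on the page has to run before some of the elements on the page are populated, and requests only downloads the HTML without executing any scripts. You can use Selenium to run the page's JavaScript before accessing the images.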
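To illustrate why the parser comes back empty, here is a minimal sketch that compares markup as a server might send it with the same markup after a browser has run its script. Both HTML strings are made up for illustration and are not MangaDex's real markup, and `ImgCollector` and `find_img_sources` are hypothetical helper names. The sketch uses the standard library's `html.parser` in place of BeautifulSoup so it runs without extra installs; the behavior it shows should be the same with either parser.

```python
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Collects the src attribute of every <img> tag present in the markup."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.sources.append(dict(attrs).get("src"))


def find_img_sources(html):
    """Return the src values of all <img> tags that literally appear in html."""
    parser = ImgCollector()
    parser.feed(html)
    parser.close()
    return parser.sources


# Made-up example of what the server sends: an empty container plus a script
# that would create the <img> element only when a browser executes it.
SERVED_HTML = """
<html><body>
<div id="reader"></div>
<script>
  var img = document.createElement("img");
  img.src = "/data/example/x1.png";
  document.getElementById("reader").appendChild(img);
</script>
</body></html>
"""

# Made-up example of the DOM after a browser has run that script.
RENDERED_HTML = """
<html><body>
<div id="reader"><img src="/data/example/x1.png"></div>
</body></html>
"""

print(find_img_sources(SERVED_HTML))    # []
print(find_img_sources(RENDERED_HTML))  # ['/data/example/x1.png']
```

The same function could be given HTML taken from a real browser session, for example Selenium's `driver.page_source`, which is where the img tags would be expected to show up.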
Answer 1 (score: 0)
You can use the API to get the images. The code below fetches the list of all images for the chapter and prints their URLs:
import requests

headers = {
    'Accept': 'application/json, text/plain, */*',
    'Referer': 'https://mangadex.org/chapter/435396/2',
    'DNT': '1',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) '
                  'AppleWebKit/537.36 (KHTML, like Gecko) '
                  'Chrome/73.0.3683.86 Safari/537.36',
}

# Query parameters identifying the chapter to look up.
params = (
    ('id', '435396'),
    ('type', 'chapter'),
    ('baseURL', '/api'),
)

response = requests.get('https://mangadex.org/api/', headers=headers, params=params)
data = response.json()

# Each image URL is the base URL, the chapter hash, and the page file name.
img_base_url = "https://s4.mangadex.org/data"
img_hash = data["hash"]
img_names = data["page_array"]

for img in img_names:
    print(f"{img_base_url}/{img_hash}/{img}")
Output:
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x1.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x2.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x3.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x4.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x5.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x6.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x7.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x8.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x9.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x10.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x11.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x12.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x13.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x14.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x15.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x16.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x17.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x18.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x19.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x20.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x21.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x22.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x23.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x24.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x25.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x26.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x27.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x28.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x29.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x30.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x31.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x32.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x33.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x34.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x35.png
https://s4.mangadex.org/data/ac081a99e13d8765d48e55869cd5444c/x36.png
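The URL-building step in this answer can also be separated from the network call, which makes it easy to check without hitting the site. The following is only a sketch: `build_image_urls` is a hypothetical helper name, and the `sample` dictionary is hand-written to mimic the two fields the answer reads from the JSON response ("hash" and "page_array"), using values taken from the output above. It assumes the response contains those two keys and nothing more.

```python
def build_image_urls(data, base_url="https://s4.mangadex.org/data"):
    """Build one image URL per page from a chapter's JSON data.

    Assumes data has a "hash" string and a "page_array" list of file names,
    as read in the answer's code.
    """
    chapter_hash = data["hash"]
    return [f"{base_url}/{chapter_hash}/{name}" for name in data["page_array"]]


# Hand-written stand-in for response.json(); not a real API response.
sample = {
    "hash": "ac081a99e13d8765d48e55869cd5444c",
    "page_array": ["x1.png", "x2.png"],
}

for url in build_image_urls(sample):
    print(url)
```

With a live response, the call would be `build_image_urls(response.json())`; the base URL is a parameter because the answer hard-codes it and it may differ between chapters.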