Question

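A few weeks ago I wrote this program, which successfully scraped some information from a few online stores. Now it has stopped working, even though I have not changed the code.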

Could something have changed on the website's side, or is there a problem with my code?

import requests
from bs4 import BeautifulSoup

url = 'https://www.continente.pt/stores/continente/pt-pt/public/Pages/ProductDetail.aspx?ProductId=7104665(eCsf_RetekProductCatalog_MegastoreContinenteOnline_Continente)'

res = requests.get(url)
html_page = res.content
soup = BeautifulSoup(html_page, 'html.parser')

priceInfo = soup.find('div', class_='pricePerUnit').text

priceInfo = priceInfo.replace('\n', '').replace('\r', '').replace(' ', '')

productName = soup.find('div', class_='productTitle').text.replace('\n', ' ')

productInfo = (soup.find('div', class_='productSubtitle').text
               + ', ' + soup.find('div', class_='productSubsubtitle').text)

print('Nome do produto: ' + productName)
print('Detalhes: ' + productInfo)
print('Custo: ' + priceInfo)

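I know for a fact that the content I am searching for exists and that the URL is still valid, so what could be the problem? I split the priceInfo assignment into two lines because the error comes from the first statement: soup.find returns None, and NoneType has no text attribute.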

Answer 1

The solution takes a few steps.

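1. Open the page you want to scrape in Firefox once.
2. Use the browser_cookie3 library to extract the cookies.
3. Make sure they have not expired.
4. Pass the cookies to the request: requests.get(url, cookies=browser_cookie3.firefox())
5. Send the headers shown below with the request.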

Hope it works! Happy scraping.

I tried it myself and it works.

headers = {
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'Sec-Fetch-Site': 'none',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-User': '?1',
    'Sec-Fetch-Dest': 'document',
    'Accept-Language': 'en-US,en;q=0.9,de;q=0.8',
}
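
For reference, the steps above could be combined roughly as follows. This is only a sketch under a few assumptions: fetch_soup and text_of are hypothetical helper names, HEADERS is a shortened copy of the headers above, and requests, beautifulsoup4 and browser_cookie3 are assumed to be installed, with the page already opened once in Firefox. The text_of helper returns a default instead of raising when a div is missing, which makes it easier to tell whether the site sent back the product page or something else.

```python
# Shortened copy of the headers from the answer (assumption: these two
# are enough for a first test; add the rest if the site still refuses).
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36"
    ),
    "Accept-Language": "en-US,en;q=0.9,de;q=0.8",
}


def fetch_soup(url):
    """Fetch url with Firefox cookies and browser-like headers; return parsed HTML."""
    # Third-party packages, imported here so that text_of() below can be
    # used on its own without them.
    import browser_cookie3
    import requests
    from bs4 import BeautifulSoup

    cookies = browser_cookie3.firefox()  # cookies from the local Firefox profile
    res = requests.get(url, headers=HEADERS, cookies=cookies, timeout=10)
    res.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error page
    return BeautifulSoup(res.content, "html.parser")


def text_of(soup, css_class, default=None):
    """Return the stripped text of the first <div class=css_class>, or default."""
    node = soup.find("div", class_=css_class)
    if node is None:
        return default  # the element is not in the HTML that was returned
    return node.get_text(strip=True)
```

With these in place, soup = fetch_soup(url) followed by text_of(soup, 'pricePerUnit') gives either the price text or None. If it is still None, printing res.status_code and the first part of res.text inside fetch_soup should show what the server actually returned.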

Scraping from a specific website has stopped working
