Question

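I am trying to extract a few elements from a website's HTML using Python's BeautifulSoup library. The problem is that the HTML in the response is different from what I see in the browser. Here is the code: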

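import requests
from bs4 import BeautifulSoup
import pandas as pd

url = 'https://www.nutritionix.com/brands/restaurant'

# Fetch the page and parse it; naming the parser explicitly avoids bs4's "no parser specified" warning
resp = requests.get(url, verify=True)
soup = BeautifulSoup(resp.content, 'html.parser')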

I also tried the urllib library and set a browser User-Agent header, but that did not help.

Any suggestions on how to solve this?

Answer 1

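The page is generated by JavaScript, so the HTML that requests receives does not contain the content the browser renders afterwards.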
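One quick way to confirm this is to check whether text you can see in the browser appears anywhere in the raw response outside of script tags. The following is only a rough sketch using the standard library; the names VisibleTextParser and static_html_contains are made up for this illustration and are not part of requests or BeautifulSoup.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text nodes that are not inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that sits outside script/style blocks
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def static_html_contains(html, needle):
    """Return True if `needle` appears in the markup's non-script text."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return needle.lower() in " ".join(parser.chunks).lower()
```

With the code from the question you could call static_html_contains(resp.text, "a brand name you see in the browser"). If it returns False, the text is most likely filled in by JavaScript after the page loads.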

Try inspecting the page's network requests with Firebug or the Google Chrome developer tools.

The data you want actually comes from https://d1gvlspmcma3iu.cloudfront.net/brands-restaurant.json.gz
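Once you know the data URL, you can request it directly instead of parsing the page. Below is a minimal sketch, assuming the URL is still live and returns JSON that may or may not still be gzip-compressed by the time requests hands it over. I have not checked the structure of the JSON, and the helper names decode_json_payload and fetch_brands are hypothetical.

```python
import gzip
import json

DATA_URL = "https://d1gvlspmcma3iu.cloudfront.net/brands-restaurant.json.gz"


def decode_json_payload(raw):
    """Parse JSON from bytes, gunzipping first if the gzip magic number is present."""
    if raw[:2] == b"\x1f\x8b":  # gzip files start with these two bytes
        raw = gzip.decompress(raw)
    return json.loads(raw.decode("utf-8"))


def fetch_brands(url=DATA_URL, timeout=10):
    """Download the data file and return the parsed JSON (assumes the URL still works)."""
    # Imported here so decode_json_payload can be used without requests installed
    import requests

    resp = requests.get(url, timeout=timeout)
    resp.raise_for_status()
    return decode_json_payload(resp.content)
```

Checking for the gzip magic number covers both cases: requests transparently decompresses bodies sent with a Content-Encoding: gzip header, but a .gz file served as a plain binary download arrives still compressed. Calling fetch_brands() returns whatever JSON structure the file holds, which you can then inspect and load into pandas if it fits.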
