Question

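I am sending a GET request to this site, https://www.everything5pounds.com/en/Womens/c/womens#/?q=&sort=newArrivals, and the response I get is the page source (the same content the browser renders).

But when I use the Network tab in Chrome, I can see that the URL responds with JSON. Oddly, even with 'accept': 'application/json', I still can't get a JSON response.

Here is the code I'm using.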

import requests
from bs4 import BeautifulSoup

headers = requests.utils.default_headers()
headers.update({
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0',
    'accept':'application/json'
})
url = 'https://www.everything5pounds.com/en/Womens/c/womens#/?q=&sort=newArrivals'
response = requests.get(url, headers=headers)  # pass the headers built above
content = BeautifulSoup(response.content,'lxml')
print(content)

If I'm doing something wrong, please correct me; otherwise, please explain why.

Answer 1

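Your URL is incorrect. Everything after the # is a fragment, which is never sent to the server, so your request only fetches the HTML page. The JSON comes from the /results/ path: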

import json
import requests
from pprint import pprint

url = 'https://www.everything5pounds.com/en/Womens/c/womens/results/?q=&sort=newArrivals'

data = json.loads(requests.get(url).text)
# requests can also decode the JSON directly, with no need for the json module:
# data = requests.get(url).json()


pprint(data)

Prints:

{'currentQuery': ':newArrivals',
 'pagination': {'currentPage': 0,
                'numberOfPages': 458,
                'pageSize': 24,
                'sort': 'newArrivals',
                'totalNumberOfResults': 10973},
 'results': [{'availableForPickup': None,
              'availableInCurrentStore': None,
              'averageRating': 5.0,
              'badgeCode': None,
              'badgeUrl': None,
              'baseOptions': None,
              'baseProduct': None,
              'baseProductUrl': None,
              'categories': None,
              'categoryUrl': None,
              'classifications': None,
              'cleanUrl': '/Tie-Up-Cold-Shoulder-Dip-Hem-Dress/p/659773',
              'code': '659773',

...and so on.
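
To see why the original URL returned HTML, it may help to look at how each URL splits into parts. The following is a minimal sketch using only the standard library; `request_target` is a hypothetical helper name, not part of requests, and it only approximates what an HTTP client puts on the request line (path plus query, without the fragment).

```python
from urllib.parse import urlsplit


def request_target(url):
    """Return the path and query of a URL, leaving out the fragment.

    Hypothetical helper for illustration: HTTP clients send the path and
    query to the server, while the part after '#' stays on the client side.
    """
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


page_url = 'https://www.everything5pounds.com/en/Womens/c/womens#/?q=&sort=newArrivals'
json_url = 'https://www.everything5pounds.com/en/Womens/c/womens/results/?q=&sort=newArrivals'

# In the question's URL, the '?' comes after the '#', so the whole
# "query" is actually part of the fragment and is not sent.
print(request_target(page_url))
print(urlsplit(page_url).fragment)

# In the corrected URL, the query is a real query string.
print(request_target(json_url))
```

With the question's URL, the server only ever sees the plain category page path, which would explain the HTML response regardless of the accept header.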
