Question

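I am trying to fetch data with requests.get(). The response is large (it contains 10,000 MongoDB records), but the response I get is almost always truncated. Only rarely do I get the correct result.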

Example:
It should look like this:

[
    {
        "_id":"5a72c839c634133e1e9ab502",
        "data":{"today_wh":13500},
        "dts":"2018-02-01T07:56:31.000Z",
        "ts":1517471791
    },
    {
        "_id":"5a72c839c634133e1e9ab503",
        "data":{"today_wh":13500},
        "dts":"2018-02-01T07:57:06.000Z",
        "ts":1517471826
    }
]

But it looks like this:

[
    {
        "_id":"5a72c8ecc634133e1e9ab51b",
        "data":{"today_wh":13700},
        "dts":"2018-02-01T08:00:01.000Z",
        "ts":1517472001
    },
    {
        "_id":

How can I get the whole result?

Answer 1

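The default User-Agent header set by requests is 'User-Agent': 'python-requests/2.7.6'. The server may be treating scripted clients differently, so try making the request look like it comes from a browser rather than a script. You can set the User-Agent as follows: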

import requests
url = "http://example.com/"
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.80 Safari/537.36',
    'Content-Type': 'text/html',
}
response = requests.get(url, headers=headers)
html = response.text
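
If changing the User-Agent does not help, it may be useful to confirm whether the body really was cut off before looking for the cause. The following is only a minimal sketch, not part of the original answer: `check_body` is a hypothetical helper that compares the number of bytes received with the `Content-Length` header (when one is provided) and then tries to parse the body as JSON. The length comparison is only meaningful when the body was not compressed, so pass `None` for the length if the response has a `Content-Encoding` header.

```python
import json
from typing import Optional, Tuple


def check_body(body: bytes, content_length: Optional[str]) -> Tuple[bool, str]:
    """Return (ok, reason) for a response body that should contain JSON.

    Hypothetical helper for illustration only; the name and signature
    are assumptions, not part of the requests library.
    """
    # Compare the received byte count with the advertised length, if any.
    if content_length is not None and content_length.isdigit():
        expected = int(content_length)
        if len(body) != expected:
            return False, f"got {len(body)} bytes, expected {expected}"

    # A truncated JSON document fails to parse; JSONDecodeError is a ValueError.
    try:
        json.loads(body)
    except ValueError as exc:
        return False, f"invalid JSON: {exc}"

    return True, "ok"


# Possible usage with the response from the snippet above:
#   ok, reason = check_body(response.content,
#                           response.headers.get("Content-Length"))
#   if not ok:
#       print("Response looks incomplete:", reason)
```

If the check reports a byte-count mismatch, the data was likely lost in transit or cut off by the server; if only the JSON parse fails, the server itself may be sending an incomplete document.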
