Handling pagination recursively

Date: 2019-05-08 13:52:30

Tags: python django python-requests

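I am using the requests library to fetch data from a remote server and save it in a model, but I need to handle pagination. At the moment I only load one page from the server.

The API returns paginated responses like this: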

{
"status": "success",
"count": 32,
"total": 32,
"next": "https://pimber.ly/api/v2/products?sinceId=5c3ca8470985af0016229b5b",
"previous": "https://pimber.ly/api/v2/products?maxId=5c3ca8470985af0016229b04",
"sinceId": "5c3ca8470985af0016229b04",
"maxId": "5c3ca8470985af0016229b5b",
"data": [
    {
        "Primary ID": "API_DOCS_PROD1",
        "Product Name": "Example Product 1",
        "Product Reference": "Example Reference 1",
        "Buyer": "Example Buyer 1",
        "_id": "5c3ca8470985af0016229b04",
        "primaryId": "API_DOCS_PROD1"
    },

I tried to use a Python generator to handle this, but it does nothing:

_plimber_data = response.json()
yield _plimber_data
_next = _plimber_data['next']
print(_next)
for page in _next:
    _next_page = session.get(_plimber_data, params={'next': page}).json()
    yield _next_page['next']

    for _data in page:
        Product.objects.create(
            qr_id=_data['primaryId'],
            ean_code=_data['EAN'],
            description=_data['Description105'],
            category=_data['Category'],
            marketing_text=_data['Marketing Text'],
            bullet=_data['Bullet 1'],
            brand_image=_data['Brand Image'],
            image=_data['Images']
        )
        logger.debug(f'Something went wrong {_data}')
        print(f'This is the Data:{_data}')

Can someone explain how to handle this so that all the data is loaded into the database?
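
Two likely reasons the generator attempt in the question appears to do nothing can be shown in isolation. This is a minimal sketch, not a diagnosis of the full program: the function name `pages` is made up for illustration, and the URL is the `next` value from the response above. First, the body of a generator function does not run until something iterates over it. Second, `next` is a single URL string, so looping over it walks its characters rather than a list of pages.

```python
def pages():
    # Nothing in this body runs until the generator is iterated.
    print("fetching first page")
    yield {"next": "https://pimber.ly/api/v2/products?sinceId=5c3ca8470985af0016229b5b"}


gen = pages()      # no output yet: calling only creates a generator object
first = next(gen)  # now the body runs up to the first yield

# Iterating over a string walks its characters, not a list of URLs.
chars = [ch for ch in first["next"]][:5]
print(chars)
```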

1 Answer:

Answer 0 (score: 0)

OK, I have solved it. There are two parts. First, the generator function:

import requests
from django.conf import settings
from requests.exceptions import HTTPError


def _get_product():
    """Yield every product from the paginated API, one item at a time."""
    headers = {
        'Accept': 'application/json',
        'Content-Type': 'application/json',
        'Authorization': settings.TOKEN
    }

    # Start at the first page, then follow each page's 'next' link.
    url = settings.API_DOMAIN
    while url:
        try:
            response = requests.get(url, headers=headers)
            response.raise_for_status()
        except HTTPError as http_err:
            print(f'HTTP error occurred: {http_err}')
            return

        _plimber_data = response.json()
        # Yield the current page's items before moving on, so that the
        # first page is not skipped.
        for _data in _plimber_data['data']:
            yield _data
        url = _plimber_data.get('next')
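
The pagination loop can also be tried without network access. The following is a minimal sketch, not the API's documented behaviour: `iter_items`, `fetch`, and `FAKE_PAGES` are hypothetical names, and it assumes the last page reports `next` as null or omits it. It yields the first page's items as well as those of later pages, and it stops if a `next` link points back to a page that was already visited.

```python
def iter_items(first_url, fetch):
    """Yield every item from every page, following 'next' links.

    `fetch` is a hypothetical callable that takes a URL and returns the
    decoded JSON body as a dict (for example a thin wrapper around
    requests.get(url, headers=...).json()).
    """
    url = first_url
    seen = set()
    while url and url not in seen:
        seen.add(url)  # guard against a link that points back to a visited page
        page = fetch(url)
        yield from page.get("data", [])
        url = page.get("next")


# Fake two-page API used only to exercise the generator.
FAKE_PAGES = {
    "page1": {"data": [{"primaryId": "A"}, {"primaryId": "B"}], "next": "page2"},
    "page2": {"data": [{"primaryId": "C"}], "next": None},
}

ids = [item["primaryId"] for item in iter_items("page1", FAKE_PAGES.__getitem__)]
print(ids)
```

Passing the fetch function in as an argument keeps the loop independent of `requests`, so the same generator could be driven by a real session in production and by a dictionary lookup in a test.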

Then I iterate over the generator function and save the data:

def run(self):
    _page_data = _get_product()
    for _product in _page_data:
        Product.objects.create(
            qr_id=_product['primaryId'],
            ean_code=_product['EAN'],
            description=_product['Description105'],
            category=_product['Category'],
            marketing_text=_product['Marketing Text'],
            bullet=_product['Bullet 1'],
            brand_image='\n'.join(_product['Brand Image']),
            image='\n'.join(_product['Images'])
        )
        logger.debug(f'Saved product {_product}')
        print(f'This is the Data:{_product}')
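
Saving one row per `create()` call issues one query per product. If that becomes slow, the generator's output could be grouped into batches first. The sketch below is one possible approach, not part of the original answer: `batched` and `fake_products` are hypothetical names, and the batch size of 2 is arbitrary.

```python
from itertools import islice


def batched(iterable, size):
    """Yield lists of up to `size` items from any iterable, including a generator."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Stand-in for the product generator; real code would pass _get_product().
fake_products = ({"primaryId": f"P{i}"} for i in range(5))
batches = list(batched(fake_products, 2))
print([len(b) for b in batches])

# With Django, each batch could then be saved in a single call, e.g.:
#   Product.objects.bulk_create([Product(qr_id=p['primaryId']) for p in batch])
```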