How can I make async HTTP requests that include a login, and how do I accumulate the results?

Asked: 2019-04-19 17:02:04

Tags: python aiohttp

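I want to use asynchronous HTTP requests in some code, but I'm having trouble incorporating a login that provides a session ID. I'd also like help understanding the best way to accumulate the responses so that I can build a pandas dataframe from them afterwards.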

The code shown below is based on the following blog post, which describes how to make a large number of asynchronous HTTP requests: https://medium.com/@cgarciae/making-an-infinite-number-of-requests-with-python-aiohttp-pypeln-3a552b97dc95

The differences between that example and mine are that I'm trying to include the following:

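  • A login is required, and the session ID must be passed on every request (I don't think my code does this).
  • My URL needs parameters passed to it.
  • I want to accumulate each response together with an index (originally taken from a pandas dataframe) so that I can rebuild a dataframe with the original index.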
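The third point can be sketched with the standard library alone. This is a minimal illustration, not the blog post's approach: it swaps TaskPool for an `asyncio.Semaphore` plus `asyncio.gather`, and `fake_fetch` is a hypothetical stand-in for the real HTTP call. Because `gather` returns results in the order the awaitables were passed in, each response stays paired with its index even when requests finish out of order.

```python
import asyncio


async def fake_fetch(index, params):
    # Hypothetical stand-in for the real HTTP request: lower indexes sleep
    # longer, so the requests deliberately finish out of order.
    await asyncio.sleep(0.01 * (2 - index))
    return f"<r>{params['_v3']}</r>"


async def fetch_all(payload, limit):
    semaphore = asyncio.Semaphore(limit)  # at most `limit` requests in flight

    async def bounded(index, params):
        async with semaphore:
            return index, await fake_fetch(index, params)

    # gather preserves the order of its arguments, not the completion order
    pairs = await asyncio.gather(*(bounded(i, p) for i, p in payload.items()))
    indexes = [i for i, _ in pairs]
    bodies = [b for _, b in pairs]
    return indexes, bodies


payload = {0: {"_v3": "ABC"}, 1: {"_v3": "CAT"}}
indexes, bodies = asyncio.run(fetch_all(payload, limit=2))
```

Returning the index alongside the body from each task means no shared state is needed, and the two lists can be handed straight to a dataframe constructor.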

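Most of the examples I've found are fairly simple, and I'm struggling to put them together even for this relatively simple case. I'm also not wedded to these libraries, so examples using others (Treq, etc.) are welcome. I'm a new Python programmer, so much of this code is probably non-Pythonic :)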

import pandas as pd
from pypeln import TaskPool
from aiohttp import ClientSession, TCPConnector
import asyncio
from collections import OrderedDict

# payload is created using payload = df.to_dict("index", into=OrderedDict)
# these are the records (params) that I will need to iterate over to build
# new urls/subsequent requests.
# I wanted the "index" version so I can tie the original index to the http
# response and create a new dataframe in the end.

payload = OrderedDict([(0,
              {'_v3': 'ABC',
               '_v5': 10,
               'EVENT': 'Change'}),
             (1,
              {'_v3': 'CAT',
               '_v5': 115,
               'EVENT': 'Change'})])

# this returned session includes a session id that must be used
# on subsequent requests

async def login_async(session, host, username, password):
    sso_url = host + "/sso/SSOServ"
    login_data = {}
    login_data["_ssoUser"] = username
    login_data["_ssoPass"] = password
    login_data["_action"] = "LOGIN"
    login_data["_fromLoginPage"] = "TRUE"
    login_data["_ssoOrigUrl"] = host + "/app/portal/logondone.htm"
    login_data["_serviceName"] = "SSOP"
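    # if the session ID is delivered as a cookie, aiohttp keeps it in the
    # session's cookie jar and sends it on later requests made with the
    # same session object
    async with session.post(sso_url, data=login_data) as response:
        print(response.headers["sso_status"])
    return session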

# the below code structure was mostly taken from the blog
# r will be xml

async def fetch(session, url, params):
    async with session.get(url, params=params) as response:
        r = await response.text()
        return r


async def _main(url, payload, limit):
    connector = TCPConnector(limit=None)
    results = {}

    async def fetch_one(session, index, params):
        # parse the xml into a dict (function not provided) and key it by
        # the original dataframe index
        results[index] = _xml_response_to_dict(await fetch(session, url, params))

    async with ClientSession(connector=connector) as session, TaskPool(limit) as tasks:
        # log in first so every later request reuses the logged-in session
        await login_async(session=session, host=url, username='username1', password='password2')
        # payload.items() yields (index, params) pairs
        for index, params in payload.items():
            # tasks.put only schedules the coroutine (per the blog post), so
            # the result is stored by fetch_one rather than returned here
            await tasks.put(fetch_one(session, index, params))
    # leaving the "async with" block waits for all scheduled tasks to finish
    indexes = list(payload)
    responses = [results[i] for i in indexes]
    return (indexes, responses)



url = 'http://hosturl'
loop = asyncio.get_event_loop()
# run_until_complete returns the value returned by _main
indexes, responses = loop.run_until_complete(_main(url, payload, limit=1000))

# take the returned index list and response list and create a dataframe
df_status = pd.DataFrame.from_records(responses, index=indexes)

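I'm new to asynchronous programming in Python, so I'm not sure what to expect, but I'd like the indexes and responses to be returned from the _main function. From those lists I want to build a dataframe. I'll post the error output when I get back today.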
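Since `_xml_response_to_dict` is not shown, the following is only a guess at what that step might look like. It assumes a flat XML document whose child elements map directly to columns; the sample bodies and the `xml_response_to_dict` name are made up for illustration and are not the real service's output.

```python
import xml.etree.ElementTree as ET


def xml_response_to_dict(xml_text):
    # Hypothetical stand-in for the unshown _xml_response_to_dict:
    # maps each direct child element's tag to its text.
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


# Made-up sample data standing in for the lists returned by _main.
indexes = [0, 1]
bodies = [
    "<result><status>OK</status><id>ABC</id></result>",
    "<result><status>FAILED</status><id>CAT</id></result>",
]

records = [xml_response_to_dict(body) for body in bodies]
# Key each record by its original index so the pairing is explicit.
rows = dict(zip(indexes, records))
```

A mapping shaped like `rows` can then be passed to `pd.DataFrame.from_dict(rows, orient="index")`, which uses the keys as the dataframe index.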

0 answers:

No answers yet