Fetching multiple different URLs with the same ClientSession

Date: 2018-10-29 15:20:34

Tags: python beautifulsoup python-requests python-asyncio aiohttp

I normally write my code with requests, so I don't have much experience with aiohttp. But since requests is blocking, I have to use aiohttp.

Here is what my code looks like in requests:

import requests
from bs4 import BeautifulSoup as soup

# Account gen code is here using requests

r = requests.get(product_link)

watch_link = soup(r.text, "html.parser").find("div", {"id": "vi-atl-lnk"}).a["href"]

r = requests.get(watch_link)
r = requests.get(watch_link)

What it does is go to an eBay listing, then use BS4 to scrape the "watch" link out of the listing's source code. It then adds the listing to the watchlist with GET requests. There have to be 2 GET requests to the "add to watchlist" link, otherwise the item isn't actually added.

That works in requests, but now I need to write it in aiohttp. The closest I've gotten is this:

import asyncio
import time

import aiohttp
from bs4 import BeautifulSoup as soup

session = aiohttp.ClientSession()

async def main():
    # Account gen code is here using aiohttp and session
    async with session.get(product_link) as resp:
        r = await resp.text()
        watch_link = soup(r, "html.parser").find("div", {"id": "vi-atl-lnk"}).a["href"]
    async with session.get(watch_link) as respp:
        time.sleep(.1)
    async with session.get(watch_link) as resp:
        time.sleep(.1)

loop = asyncio.get_event_loop()
loop.run_until_complete(main())

I tried this and it ran for me, but it didn't add the item to the watchlist. The code above it (not shown, since it isn't relevant to this question AFAIK) has no issues and runs fine. But when it gets to the watchlist part, it doesn't work. What could be the reason?
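One detail worth flagging in the snippet above, separate from the watchlist question itself: `time.sleep()` is a blocking call, so inside a coroutine it stalls the entire event loop for the duration; the async equivalent is `await asyncio.sleep()`. A minimal timing sketch (no network involved) showing the difference:

```python
import asyncio
import time

async def pause():
    # asyncio.sleep() yields control back to the event loop,
    # so other coroutines can run during the delay;
    # time.sleep() would freeze all of them instead
    await asyncio.sleep(0.1)

async def main():
    start = time.perf_counter()
    # Three concurrent 0.1 s pauses overlap and finish together;
    # three time.sleep(0.1) calls would run back to back (~0.3 s)
    await asyncio.gather(pause(), pause(), pause())
    return time.perf_counter() - start

elapsed = asyncio.run(main())
print(round(elapsed, 1))  # ~0.1, not 0.3
```

Whether the blocking sleep is what breaks the watchlist add is a separate question, but swapping in `await asyncio.sleep(.1)` keeps the session responsive between the two GETs.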

2 Answers:

Answer 0 (score: 2)

I tried many times and finally found that the problem is with the cookies. You need to change your code to `aiohttp.ClientSession(headers=headers)`, not `aiohttp.ClientSession(headers=headers, cookies=cookies)`. BTW, the cause is probably in the cookie handling, where `;` gets converted to `\073`.

Here is the code I sorted out:

import aiohttp
import asyncio
from bs4 import BeautifulSoup as soup

product_link = ""

cookies = {"Cookie":"_ga=GA1.2.808...."}
headers = {"Connection": "keep-alive"}
headers.update(cookies)

async def main():
    #Account gen code is here using aiohttp and session 
    async with aiohttp.ClientSession(headers=headers) as sessions:

        async with sessions.get(product_link) as resp:
            r = await resp.text()
            watch_link = soup(r, "lxml").find("div", {"id": "vi-atl-lnk"}).a.get("href")
            print(watch_link)

        async with sessions.get(watch_link) as resp:
            pass

        async with sessions.get(watch_link) as resp:
            pass


loop = asyncio.get_event_loop()
loop.run_until_complete(main())
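The `\073` detail mentioned in this answer can be reproduced with the standard library: Python's cookie machinery quotes characters that are not legal in a bare cookie value, and `;` becomes the octal escape `\073`. A small sketch:

```python
from http.cookies import SimpleCookie

# A value containing ';' is not a legal bare cookie value,
# so SimpleCookie quotes it and escapes ';' as octal \073
c = SimpleCookie()
c["token"] = "abc;def"
print(c.output())  # Set-Cookie: token="abc\073def"
```

This is why sending the cookie verbatim as a `Cookie:` header (merged into `headers`) preserves the original `;`-separated string, while passing it through `cookies=` can mangle it.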

Answer 1 (score: 0)

session = aiohttp.ClientSession()
async with session.post(link, data=payload, headers=headers) as resp:
    print("Created account with email " + random_catchall)
async with session.get(product_link, headers=headers) as response:
    r = await response.text()
    watch_link = soup(r, "html.parser").find("div", {"id": "vi-atl-lnk"}).a["href"]

    print("Connected to product")

async with session.get(watch_link, headers=headers) as respp:
    print("Successfully added to watchlist")
async with session.get(watch_link, headers=headers) as respp:
    print("Added to watch list")

await session.close()

This is what finally worked for me. No cookies or anything like that needed :)