How do you mimic Instagram's infinite scroll using Python?

Time: 2018-06-17 14:21:57

Tags: python python-2.7 web-scraping instagram

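I wrote a small Python program that scrapes an Instagram profile to extract data and display various statistics. I can collect data from the first 9 photos on the profile (or however many appear on the initial load), but I haven't been able to load the remaining photos because of the infinite scroll mechanism. I've read online about scraping pages that use infinite scroll, and people say you need to replicate the request that loads the additional pictures. So far I haven't been able to replicate that request. Can anyone help?

Thanks!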

1 answer:

Answer 0 (score: 1)

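There is no need to write all of the code yourself; several libraries already replicate these requests.

One such library is https://github.com/ping/instagram_private_api

A solution using this library: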

from instagram_private_api import Client, ClientCompatPatch

user_name = 'YOUR_USERNAME'
password = 'YOUR_PASSWORD'
username_to_scrape = 'USERNAME_TO_SCRAPE'

all_posts = []

api = Client(user_name, password)
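posts = api.username_feed(username_to_scrape)  # Gets the first page of posts
all_posts = all_posts + posts["items"]  # the response is a dict; the posts are under "items"

# Extract next_max_id from the response above; it is needed to load the next page.
# It is absent when there are no more posts, so use .get() to avoid a KeyError.
next_max_id = posts.get("next_max_id")

# Request the next page by passing next_max_id as max_id
next_page_posts = api.username_feed(username_to_scrape, max_id=next_max_id)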

This is just a simple example to help you get started.
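To fetch more than two pages, the same step can be repeated until the response no longer carries a cursor. The following is a minimal sketch of that loop, not part of the library: `fetch_all_posts`, `fetch_page`, and `max_pages` are hypothetical names introduced here, and the sketch assumes each response is a dict with an "items" list and, when more pages exist, a "next_max_id" value.

```python
def fetch_all_posts(fetch_page, max_pages=10):
    """Collect items from a cursor-paginated feed.

    fetch_page is a hypothetical callable: fetch_page(max_id) -> dict
    holding an "items" list and, if more pages exist, a "next_max_id".
    max_pages caps the number of requests so the loop always terminates.
    """
    all_posts = []
    max_id = None  # None means "request the first page"
    for _ in range(max_pages):
        page = fetch_page(max_id)
        all_posts.extend(page.get("items", []))
        max_id = page.get("next_max_id")
        if not max_id:
            break  # no cursor returned: this was the last page
    return all_posts


# Intended use with the client from the answer (not executed here):
#
# def fetch_page(max_id):
#     if max_id is None:
#         return api.username_feed(username_to_scrape)
#     return api.username_feed(username_to_scrape, max_id=max_id)
#
# all_posts = fetch_all_posts(fetch_page, max_pages=5)
```

Passing the fetch step in as a function keeps the loop independent of the client, so it can be tried against canned responses before making real requests. A pause between requests may also be worth adding to avoid being rate limited.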

Update: saving & loading cookies

#Saving cookies
cookies = api.cookie_jar.dump()
with open("cookies.pkl", "wb") as save_cookies:
    save_cookies.write(cookies)

#Loading cookies
with open("cookies.pkl", "rb") as read_cookies:
    cookies = read_cookies.read()

# Pass the cookies to Client to resume the session
api = Client(user_name, password, cookie=cookies)
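The save and load steps above can be combined so that a fresh login happens only when no cookie file exists yet. This is a rough sketch under that assumption: `read_saved_cookies` and `write_saved_cookies` are hypothetical helper names, and the file name simply follows the snippet above.

```python
import os


def read_saved_cookies(path="cookies.pkl"):
    """Return the saved cookie bytes, or None if no file exists yet."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as cookie_file:
        return cookie_file.read()


def write_saved_cookies(cookies, path="cookies.pkl"):
    """Write the cookie bytes (e.g. from api.cookie_jar.dump()) to disk."""
    with open(path, "wb") as cookie_file:
        cookie_file.write(cookies)


# Intended use with the calls shown above (not executed here):
#
# cookies = read_saved_cookies()
# if cookies is None:
#     api = Client(user_name, password)           # fresh login
#     write_saved_cookies(api.cookie_jar.dump())  # remember the session
# else:
#     api = Client(user_name, password, cookie=cookies)
```

Saved cookies eventually expire, so a real script would likely need to fall back to a fresh login when the resumed session is rejected.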