Question

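I am currently using Python to collect information about Instagram users from a text file containing links to their profiles. I can already collect the number of followers, the number of accounts followed, and the number of posts, but I would also like to collect each user's bio. Having the bio would eventually let me parse it and extract e-mail addresses. What is the best and simplest way to do this?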

I have no experience with Python, so I took sample code from the internet. I tried to analyze the code and modify it using what I know, but without success.

import requests
import urllib.request
import urllib.parse
import urllib.error
from bs4 import BeautifulSoup
import ssl
import json


class Insta_Info_Scraper:

    def getinfo(self, url):
        html = urllib.request.urlopen(url, context=self.ctx).read()
        soup = BeautifulSoup(html, 'html.parser')
        data = soup.find_all('meta', attrs= {'property':'og:description'})
        text = data[0].get('content').split()
        user = '%s %s %s' % (text[-3], text[-2], text[-1])
        followers = text[0]
        following = text[2]
        posts = text[4]
        email = ""
        print ('User:', user)
        print ('Followers:', followers)
        print ('Following:', following)
        print ('Posts:', posts)
        print ('Email:', email)
        print ('---------------------------')

    def main(self):
        self.ctx = ssl.create_default_context()
        self.ctx.check_hostname = False
        self.ctx.verify_mode = ssl.CERT_NONE

        with open('users.txt') as f:
            self.content = f.readlines()
        self.content = [x.strip() for x in self.content]
        for url in self.content:
            self.getinfo(url)


if __name__ == '__main__':
    obj = Insta_Info_Scraper()
    obj.main()

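At the moment I assign an empty string to the 'email' variable, but I eventually want to replace it with code that retrieves the e-mail address of the given user.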

Answer 1

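The best way is to use a third-party library such as instagram_private_api, whose package also provides the instagram_web_api module used here.

Example: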

from instagram_web_api import Client

web_api = Client(auto_patch=True, drop_incompat_keys=False)
user_info = web_api.user_info2('instagram')
print(user_info)

Answer 2

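Instaloader is a convenient tool for accessing Instagram's public data structures. It is a Python package that provides both a Python module and a CLI for accessing Instagram. After installing it with pip install instaloader, you can easily save a profile's metadata to a JSON file with: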

instaloader --no-posts --no-profile-pic --no-compress-json profile1 [profile2 ...]

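You can then use jq, a lightweight and flexible command-line JSON processor, to extract the information you just saved. For example, the following command prints the bio of profile1: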

jq -r .node.biography profile1/profile1_*.json
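
If jq is not installed, the saved file can probably be read just as well with Python's standard json module. The sketch below assumes the file layout implied by the jq command (a profile1/profile1_*.json file whose biography sits under node.biography); the function name read_saved_biography and its base_dir parameter are made up for illustration.

```python
import glob
import json
import os


def read_saved_biography(profile_name, base_dir="."):
    """Return the biography stored in a saved metadata file, or None if no file is found."""
    # Mirrors the shell glob profile1/profile1_*.json from the jq command.
    pattern = os.path.join(base_dir, profile_name, profile_name + "_*.json")
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
        # Same lookup as the jq filter .node.biography (assumed file structure).
        return data.get("node", {}).get("biography")
    return None


if __name__ == "__main__":
    print(read_saved_biography("profile1"))
```

This returns None instead of raising when the profile directory or the biography key is missing, which may be more convenient when looping over many profiles.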

Likewise, here is a way to access the same information without leaving Python:

import instaloader
L = instaloader.Instaloader()
profile = instaloader.Profile.from_username(L.context, 'profile1')
print(profile.biography)
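
Since the question's end goal is to pull e-mail addresses out of the bio, here is a minimal sketch of that last step. It works on any biography string (for example profile.biography) and uses a deliberately simple regular expression, so it will miss unusual address formats; extract_emails and EMAIL_RE are illustrative names, not part of Instaloader.

```python
import re

# Simple pattern for common addresses; it does not cover every valid e-mail format.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(biography):
    """Return e-mail-like strings found in the biography, in order, without duplicates."""
    found = []
    for match in EMAIL_RE.findall(biography or ""):
        if match not in found:
            found.append(match)
    return found


if __name__ == "__main__":
    sample_bio = "Photographer | bookings: hello@example.com | NYC"
    print(extract_emails(sample_bio))
```

In the question's script, the result could replace the empty string assigned to email, for instance by joining the returned list with commas.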
