Question

I am trying to build a Microsoft Access database of Instagram accounts and want to extract the following data:

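Account name
Number of followers
Number of people followed
Number of posts (and their dates)
Number of pictures
Number of comments per picture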

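I have no problem building the database, but I would like to know whether there is an easier/faster way to get all of this information without looking at every individual picture/account and picking the values out by hand.

Is Microsoft Access the best way to do this? Is there a better solution?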

Answer 1

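Well, since this question carries the 'web-scraping' keyword, allow me to share some information here.

Instagram embeds JSON data in JavaScript inside the HTML source of the page that shows a user's information, for example https://www.instagram.com/user-account/. You can get the JSON by parsing that page with any scripting language.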
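A minimal Python sketch of that parsing step. It assumes the JSON is assigned to a JavaScript variable inside a script tag; the variable name window._sharedData, the helper extract_embedded_json, and the sample HTML are all assumptions for illustration and may not match what the page actually serves.

```python
import json
import re

# Assumed page layout: <script ...>window._sharedData = {...};</script>
_SHARED_DATA_RE = re.compile(
    r"window\._sharedData\s*=\s*(\{.*?\})\s*;\s*</script>", re.DOTALL
)


def extract_embedded_json(html):
    """Return the embedded JSON object as a dict, or None if it is not found."""
    match = _SHARED_DATA_RE.search(html)
    if match is None:
        return None
    return json.loads(match.group(1))


# Made-up HTML standing in for a downloaded profile page
sample_html = (
    '<html><body><script type="text/javascript">'
    'window._sharedData = {"user": {"username": "demo-user", "followers": 42}};'
    "</script></body></html>"
)

data = extract_embedded_json(sample_html)
print(data["user"]["username"])
```

The regular expression only works while the assignment ends with "};" directly before the closing script tag, so it would need adjusting if the page layout differs.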

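Instagram shows only 10 posts per request, along with the user's basic information such as user name, biography, number of posts, number of followers and number of accounts followed. If you need the likes and comments of every picture, you have to click the 'Load more' button.

The 'Load more' Ajax call includes a '?max_id' parameter, which gives you the information for the next 10 posts. So you have to write a loop that keeps sending requests and fetching the remaining information until 'max_id' is null or empty.

Example request, first page: https://www.instagram.com/demo-user/

Next data request: https://www.instagram.com/demo-user/?max_id=1533276522

And so on...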
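The loop described above could be sketched in Python roughly as follows. The function fetch_page is a hypothetical callable that performs one request for a given max_id and returns the posts plus the next max_id; the fake pages exist only so the sketch runs offline.

```python
def fetch_all_posts(fetch_page):
    """Collect posts page by page until no further max_id is returned.

    fetch_page(max_id) is a hypothetical callable that requests one page
    and returns a tuple (posts, next_max_id).
    """
    all_posts = []
    max_id = None  # None means "first page", i.e. no ?max_id parameter
    seen = set()
    while True:
        posts, next_max_id = fetch_page(max_id)
        all_posts.extend(posts)
        # Stop when max_id is null/empty, or repeats (guards against endless loops)
        if not next_max_id or next_max_id in seen:
            break
        seen.add(next_max_id)
        max_id = next_max_id
    return all_posts


# Offline stand-in for the real requests: three fake pages keyed by max_id
_FAKE_PAGES = {
    None: (["p1", "p2"], "100"),
    "100": (["p3", "p4"], "200"),
    "200": (["p5"], ""),
}


def fake_fetch_page(max_id):
    return _FAKE_PAGES[max_id]


print(fetch_all_posts(fake_fetch_page))
```

Passing the fetch step in as a function keeps the paging logic separate from the HTTP and parsing code, so either part can be swapped out.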

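I recently had some free time and was annoyed with Instagram ;) so I wrote a script that handles all of this. It is written in PHP and the code is well commented, so I don't think you will have any trouble following the application flow. You can look at how the script works and reuse the logic in any other language.

The code is in a GitHub repository.

And yes, it doesn't require the Instagram API or anything else. :)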

Answer 2

Why not just view the JSON data directly with this URL:

https://www.instagram.com/<username>/?__a=1
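A small Python sketch of building and requesting that URL. The helper names profile_json_url and fetch_profile_json are made up for illustration, and the sketch assumes the endpoint answers an unauthenticated GET with a JSON body, which may not hold.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def profile_json_url(username):
    """Build the ?__a=1 URL for one account name."""
    return "https://www.instagram.com/{0}/?__a=1".format(quote(username, safe=""))


def fetch_profile_json(username):
    """Download and decode the JSON for one account (performs a network call)."""
    with urlopen(profile_json_url(username)) as response:
        return json.loads(response.read().decode("utf-8"))


# Only the URL is built here; nothing is requested until fetch_profile_json is called
print(profile_json_url("demo-user"))
```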

Answer 3

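You should definitely look at Instagram's API, which can give you all of the public information you want to scrape. You just need to write a script that makes the right API calls (shown below).

From Instagram's website:

We do our best to have all our URLs be RESTful. Every endpoint (URL) may support one of four different HTTP verbs. GET requests fetch information about an object, POST requests create objects, PUT requests update objects, and finally DELETE requests will delete objects.

When you use the URLs in your code, you just need to have the ACCESS-TOKEN value for the relevant account ready and be able to unpack the JSON that Instagram returns for your GET requests. If a piece of data is not directly available, you can always derive it indirectly.

- User name
- Number of followers
- Number of people followed

This is a good place to start: https://www.instagram.com/developer/endpoints/users/#get_users

Here is how to call an API in Python: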

# Python 3
# RestfulClient.py

import json
from getpass import getpass

import requests
from requests.auth import HTTPDigestAuth

# Replace with the correct URL
url = "http://api_url"

# It is good practice not to hardcode credentials, so ask for them at runtime
myResponse = requests.get(
    url,
    auth=HTTPDigestAuth(input("Username: "), getpass("Password: ")),
    verify=True,
)
# print(myResponse.status_code)

# For a successful API call, the response code will be 200 (OK)
if myResponse.ok:
    # json.loads accepts str or bytes, so the raw body in .content can be passed directly.
    # It converts the JSON document into a Python data structure (dict or list, depending on the JSON).
    jData = json.loads(myResponse.content)

    print("The response contains {0} properties".format(len(jData)))
    print("\n")
    # This loop assumes the top-level JSON value is an object (dict)
    for key in jData:
        # Values may be numbers or nested objects, so format them instead of concatenating strings
        print("{0} : {1}".format(key, jData[key]))
else:
    # If the response code is not OK (200), raise the HTTP error with its description
    myResponse.raise_for_status()
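The generic client above authenticates with HTTP digest credentials, whereas the text describes supplying an ACCESS-TOKEN. A sketch of the token style follows, assuming the token is sent as an access_token query parameter. The base URL https://api.example.com/v1/users/self/, the helper names, and the payload keys are placeholders, not confirmed endpoints or field names.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_token_url(base_url, access_token):
    """Append the token as a query parameter (assumed parameter name: access_token)."""
    return "{0}?{1}".format(base_url, urlencode({"access_token": access_token}))


def pick_fields(payload, fields):
    """Keep only the wanted keys from a decoded JSON object; missing keys become None."""
    return {name: payload.get(name) for name in fields}


# Placeholder endpoint and token, for illustration only
token_url = build_token_url("https://api.example.com/v1/users/self/", "ACCESS-TOKEN")
print(token_url)

# Made-up payload standing in for a decoded API response
sample_payload = {"username": "demo-user", "followed_by": 120, "follows": 80, "bio": "hi"}
print(pick_fields(sample_payload, ["username", "followed_by", "follows"]))
```

The resulting URL could be passed to requests.get in place of the digest-authenticated call.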

Answer 4

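This repository has everything you need: https://github.com/rarcega/instagram-scraper

Read the options carefully.

Running instagram-scraper incindia -m 500 --media-metadata --include-location --media-types none gave me a JSON file with:

The URLs of the media images
The media type and view count
The number of likes and the number of comments (--comments also gives you all of the comments)

There is more that I have yet to explore.

You can also download all of the media.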
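Since the question targets Microsoft Access, one option is to flatten such JSON into CSV, which Access can import. A rough sketch in Python; the field names in FIELDS and the sample record are assumptions for illustration and will not match the scraper's real JSON layout without adjustment.

```python
import csv
import io
import json

# Assumed column names; adapt these to the keys present in the real JSON
FIELDS = ["account", "media_url", "media_type", "likes", "comments"]


def records_to_csv(records, out_file):
    """Write a list of dicts as CSV rows, leaving missing fields empty."""
    writer = csv.DictWriter(out_file, fieldnames=FIELDS)
    writer.writeheader()
    for record in records:
        # Keep only the known columns so unexpected keys do not raise an error
        writer.writerow({name: record.get(name, "") for name in FIELDS})


# Made-up record standing in for the scraper's output
sample_json = (
    '[{"account": "demo-user", "media_url": "https://example.com/1.jpg", '
    '"media_type": "image", "likes": 10, "comments": 2, "extra": "ignored"}]'
)

buffer = io.StringIO()
records_to_csv(json.loads(sample_json), buffer)
print(buffer.getvalue())
```

To write a real file instead of an in-memory buffer, open it with open(path, "w", newline="") and pass that file object to records_to_csv.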

How to extract Instagram data
