How to save photos using the Instagram API and Python

Time: 2015-09-30 19:25:27

Tags: python api instagram

I am using the Instagram API to fetch photos taken at a particular location, with the Python 3 code below:

import urllib.request

access_token = "ACCESS TOKEN"  # replace with your own access token
wp = urllib.request.urlopen("https://api.instagram.com/v1/media/search?lat=48.858844&lng=2.294351&access_token=" + access_token)
pw = wp.read()
print(pw)

This lets me retrieve all the photos. I would like to know how to save them on my computer.

My other question: is there a limit on the number of images returned by running the code above? Thanks!
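For reference, `wp.read()` returns the raw JSON body as bytes, so the image URLs have to be pulled out of it before anything can be saved. Here is a minimal sketch, assuming the response has the shape used in the answers below (`data`, then `images`, then `standard_resolution`, then `url`). The helper name `extract_image_urls` and the sample payload are made up for illustration and are not real API output.

```python
import json


def extract_image_urls(raw_body):
    """Return the standard-resolution URL of every image in a media/search body."""
    if isinstance(raw_body, bytes):
        raw_body = raw_body.decode("utf-8")
    payload = json.loads(raw_body)
    urls = []
    for item in payload.get("data", []):
        # Skip videos and other non-image media; treat a missing type as an image
        if item.get("type", "image") != "image":
            continue
        urls.append(item["images"]["standard_resolution"]["url"])
    return urls


# Hypothetical sample payload, shaped like the response described above
sample = (
    b'{"data": ['
    b'{"id": "1", "type": "image", "images": {"standard_resolution": {"url": "https://example.com/a.jpg"}}},'
    b'{"id": "2", "type": "video", "images": {"standard_resolution": {"url": "https://example.com/b.jpg"}}}'
    b']}'
)
print(extract_image_urls(sample))
```

Each returned URL can then be passed to a download call such as `urllib.request.urlretrieve`.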

2 answers:

Answer 0: (score: 1)

Finally figured this out. Here it is in case anyone needs it:

# This Python 3 script downloads up to 10,000 images from a specified location.
# 10k images take approx. 15-20 minutes and approx. 700 MB.

import json
import time
import urllib.request

print("time.time(): %f" % time.time())  # Current epoch time (Unix timestamp)
print(time.asctime(time.localtime(time.time())))  # Current time in human-readable format

#lat='48.858844' # Latitude of the center search coordinate. If used, lng is required.
#lng='2.294351'  # Longitude of the center search coordinate. If used, lat is required.

# Brooklyn Brewery
lat = '40.721645'
lng = '-73.957258'

distance = '5000'  # Default is 1 km (distance=1000), max distance is 5 km.
access_token = '<YOUR TOKEN HERE>'  # Access token to use the API

# The default time span is set to 5 days. The time span must not exceed 7 days.
# min_timestamp: a Unix timestamp. All media returned will be taken later than this timestamp.
# max_timestamp: a Unix timestamp. All media returned will be taken earlier than this timestamp.

# Settings for the verification dataset of images:
# lat, lng = 40.721645, -73.957258, distance = 5000, default time span (5 days)

images = set()  # IDs of images already saved, to keep track of duplicates
total_count = 0
timestamp_last_image = None  # created_time of the least recent image seen; None on the first request

base_url = ('https://api.instagram.com/v1/media/search?lat=' + lat + '&lng=' + lng +
            '&distance=' + distance + '&access_token=' + access_token + '&count=100')

# Images are returned in reverse order, i.e. most recent to least recent.
# At most 100 images are returned per request; to get the next set, use the timestamp
# of the last (least recent) image as max_timestamp and continue.
# To avoid duplicates, check whether the image ID has already been recorded
# (Instagram tends to return images based on a %60 timestamp).
# Use a JSON viewer such as http://www.jsoneditoronline.org/ on the URL below to understand the JSON response:
# https://api.instagram.com/v1/media/search?lat=48.858844&lng=2.294351&distance=5000&access_token=<YOUR TOKEN HERE>
with open('instaUrlFile.txt', 'w') as instaUrlFile:
    while total_count < 10000:
        url = base_url
        if timestamp_last_image is not None:
            url += '&max_timestamp=' + timestamp_last_image

        with urllib.request.urlopen(url) as response:
            data = json.loads(response.read().decode('utf-8'))

        count = 0  # number of new images saved from this response
        for img in data["data"]:
            if img['id'] in images:
                continue
            images.add(img['id'])
            total_count += 1
            count += 1
            image_url = img["images"]["standard_resolution"]["url"]
            # Download the image from its URL; the C://Instagram/ directory must already exist
            urllib.request.urlretrieve(image_url, "C://Instagram/" + str(total_count) + ".jpg")
            # Record the image URL so it can be passed directly to the Face++ API from instaUrlFile.txt
            instaUrlFile.write(image_url + "\n")
            print("Image " + str(total_count) + ".jpg was just saved with created time " + str(img["created_time"]))

        if count == 0:
            break  # empty response or only duplicates: nothing older left to fetch

        # Continue from the least recent image of this response
        timestamp_last_image = str(data["data"][-1]["created_time"])
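The paging logic in this answer can be separated from the network and file-system calls, which makes it easier to check offline. Below is a sketch of that idea, not the answerer's code: `collect_media`, `fetch_page`, `fake_fetch_page`, and `FAKE_FEED` are hypothetical names, and the fake feed only imitates an API that returns items newest first and includes the item at `max_timestamp` again.

```python
def collect_media(fetch_page, limit):
    """Collect up to `limit` unique media items, paging backwards in time.

    fetch_page(max_timestamp) must return a list of dicts with "id" and
    "created_time", newest first; max_timestamp is None on the first call.
    """
    seen = set()
    collected = []
    max_timestamp = None
    while len(collected) < limit:
        page = fetch_page(max_timestamp)
        added = 0
        for item in page:
            if item["id"] in seen:
                continue  # duplicate from the overlapping page boundary
            seen.add(item["id"])
            collected.append(item)
            added += 1
            if len(collected) == limit:
                break
        if added == 0:
            break  # empty page or only duplicates: nothing older to fetch
        max_timestamp = page[-1]["created_time"]
    return collected


# Hypothetical in-memory feed, newest first, standing in for the real API
FAKE_FEED = [{"id": str(n), "created_time": str(1000 - n)} for n in range(7)]


def fake_fetch_page(max_timestamp, page_size=3):
    """Return up to page_size fake items that are not newer than max_timestamp."""
    if max_timestamp is None:
        eligible = FAKE_FEED
    else:
        eligible = [m for m in FAKE_FEED if int(m["created_time"]) <= int(max_timestamp)]
    return eligible[:page_size]


print([m["id"] for m in collect_media(fake_fetch_page, limit=5)])
```

Stopping when a page adds nothing new keeps the loop from spinning forever once the feed runs out before the limit is reached.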

Answer 1: (score: 0)

Here is code that saves all the images. I could not test it because I do not have an Instagram token.

import json
import urllib.request


access_token = "ACCESS TOKEN"  # Put your access token here
search_results = urllib.request.urlopen("https://api.instagram.com/v1/media/search?lat=48.858844&lng=2.294351&access_token=%s" % access_token)

instagram_answer = json.loads(search_results.read().decode("utf-8"))  # Parse the Instagram media search result

for row in instagram_answer['data']:
    if row['type'] == "image":  # Skip non-image media
        filename = row['id'] + ".jpg"  # Name the file after the media ID
        url = row['images']['standard_resolution']['url']
        saved_path, headers = urllib.request.urlretrieve(
            url=url,
            filename=filename
        )  # Save the image to the current directory
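Choosing the local file name is the step most likely to go wrong when saving, since a raw URL is not a usable path. The sketch below shows one way to build a name from the media ID and write the downloaded bytes to disk. The helpers `build_filename` and `save_bytes` are hypothetical, and the example URL and bytes are placeholders rather than real Instagram data.

```python
import os
import tempfile
from urllib.parse import urlparse


def build_filename(media_id, url, default_ext=".jpg"):
    """Build a local file name from a media ID, keeping the URL's extension if present."""
    ext = os.path.splitext(urlparse(url).path)[1] or default_ext
    return media_id + ext


def save_bytes(directory, name, content):
    """Write content to directory/name, creating the directory if needed."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, name)
    with open(path, "wb") as f:
        f.write(content)
    return path


# Placeholder values; in practice content would be the bytes read from the image URL
name = build_filename("123_456", "https://example.com/path/photo.png?sig=abc")
target_dir = os.path.join(tempfile.mkdtemp(), "instagram")
saved = save_bytes(target_dir, name, b"fake image bytes")
print(name, os.path.exists(saved))
```

Parsing the URL first means a query string such as `?sig=abc` does not end up in the file extension.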