Downloading numbered image files from numbered folders on the internet using Python

Time: 2015-01-27 07:18:14

Tags: python image http loops

Question

http://www.fdci.org/imagelibrary/EventCollection/1980/Big/IMG_2524.jpg

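I have a link of this type, where 1980 is the folder, which changes, and IMG_2524.jpg is the image file name, whose number also changes.

What I want to do is download all the images from these URLs by iterating over the folder numbers in the range 1900-2000 and the image names from IMG_2000.jpg to IMG_4000.jpg. Each downloaded file must be saved in a folder named after the folder number it came from.

I think a for loop should be the way to go, but as a newbie I am a bit lost. Please help, thanks.

Update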

text_file = open('Output.txt', 'w')
for i in xrange(1900, 2001):
    for j in xrange(2000, 4001):
        year = str(i)
        image = str(j)
        new_link = 'http://www.fdci.org/imagelibrary/EventCollection/'+year+'/Big/IMG_'+image+'.jpg'

        # Write one link per line; without the newline all links run together
        text_file.write(new_link + '\n')
text_file.close()

Thanks, anmol

1 Answer:

Answer 0 (score: 0)

Actually you need two for loops, one nested inside the other, so that for every year in the range 1900 - 2000 we get all the images in the range 2000 - 4000:
for i in xrange(1900,2001):
    for j in xrange(2000, 4001):
        year = str(i)
        image = str(j)
        new_link = 'http://www.fdci.org/imagelibrary/EventCollection/'+year+'/Big/IMG_'+image+'.jpg'
        print new_link
        # Now you have every possible link within the given ranges;
        # you can then use urllib2 to fetch the response from each link
        # and do whatever you want with it

Sample output:

http://www.fdci.org/imagelibrary/EventCollection/2000/Big/IMG_3994.jpg
http://www.fdci.org/imagelibrary/EventCollection/2000/Big/IMG_3995.jpg
http://www.fdci.org/imagelibrary/EventCollection/2000/Big/IMG_3996.jpg
http://www.fdci.org/imagelibrary/EventCollection/2000/Big/IMG_3997.jpg
http://www.fdci.org/imagelibrary/EventCollection/2000/Big/IMG_3998.jpg
http://www.fdci.org/imagelibrary/EventCollection/2000/Big/IMG_3999.jpg
http://www.fdci.org/imagelibrary/EventCollection/2000/Big/IMG_4000.jpg
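
The loop above only prints the links. Below is a minimal sketch of the download step, saving each image into a folder named after its year as the question asks. It is written for Python 3, so it uses urllib.request in place of urllib2. The function names, the `downloads` root directory, the 10-second timeout and the `opener` parameter are assumptions made for illustration and are not part of the original answer. Many of the generated links probably do not exist, so failed requests are simply skipped.

```python
import os
import urllib.request

BASE_URL = 'http://www.fdci.org/imagelibrary/EventCollection/{year}/Big/IMG_{number}.jpg'


def build_url(year, number):
    """Return the image URL for a folder (year) and an image number."""
    return BASE_URL.format(year=year, number=number)


def local_path(root, year, number):
    """Return the save location, e.g. <root>/1980/IMG_2524.jpg."""
    return os.path.join(root, str(year), 'IMG_{0}.jpg'.format(number))


def download_image(year, number, root='downloads', opener=urllib.request.urlopen):
    """Fetch one image; return the saved path, or None if the fetch failed.

    `opener` is a hypothetical hook so the function can be exercised
    without network access; by default it is urllib.request.urlopen.
    """
    url = build_url(year, number)
    try:
        with opener(url, timeout=10) as response:
            data = response.read()
    except OSError:
        # URLError, HTTPError (e.g. 404) and socket timeouts are all
        # subclasses of OSError in Python 3.
        return None
    path = local_path(root, year, number)
    # Create the per-year folder on first use.
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, 'wb') as handle:
        handle.write(data)
    return path


def download_all(years, numbers, root='downloads'):
    """Try every year/number combination; return how many files were saved."""
    saved = 0
    for year in years:
        for number in numbers:
            if download_image(year, number, root) is not None:
                saved += 1
    return saved
```

Calling download_all(range(1900, 2001), range(2000, 4001)) would cover the ranges from the question. That is 101 folders times 2001 images, so roughly 200,000 requests; it may be worth trying a small range first.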