Batch-importing files from a directory into a Python script

Asked: 2018-02-23 16:52:37

Tags: python import directory

I want to import all the .jpg images from a particular directory into my Python script, but not all at once; rather, 500 images at a time.

One possible solution is the following:

from glob import glob

i = 0
batch = 500
# Read images from file
for filename in glob('Directory/*.jpg'):
    i = i + 1
    if i % batch == 0:
        # apply an algorithm with this data-batch #

Is this correct?

Is there a more efficient way to do this?

2 Answers:

Answer 0 (score: 1):

batch_size = 500
filenames = glob(...)    # fill with your own details
nfiles = len(filenames)
# number of full batches, plus how many filenames are left over
nbatches, remainder = divmod(nfiles, batch_size)
for i in xrange(nbatches):   # or range() for Python 3
    batch = filenames[batch_size * i:batch_size * (i + 1)]
    do_something_with(batch)
if remainder:
    # handle the final, smaller batch
    do_something_with(filenames[batch_size * nbatches:])
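Here, `do_something_with` is a placeholder for whatever processing the batch needs. A minimal sketch of what it might look like, assuming Pillow is installed and each image in the batch just needs to be opened (the loading step is an assumption, not something specified in the question):

from PIL import Image   # assumes Pillow is installed

def do_something_with(batch):
    """Hypothetical batch handler: load each image path in the batch."""
    images = [Image.open(path) for path in batch]
    # ... run the actual algorithm on `images` here ...
    return images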

A generator-based version that takes N elements at a time from a possibly endless iterable:

def every(thing, n):
    """every(ABCDEFG, 2) --> AB CD EF G"""
    toexit = False
    it = iter(thing)
    while not toexit:
        batch = []
        for i in xrange(n):              # or range() for Python 3
            try:
                batch.append(next(it))   # next(it) works in Python 2 and 3
            except StopIteration:
                toexit = True
                break                    # stop once the iterable is exhausted
        if not batch:
            break
        yield batch


import glob

filenames_i = glob.iglob("...")   # lazily yields matching filenames
for batch in every(filenames_i, 500):
    do_something_with(batch)

This keeps the iteration over batches concise (the `for batch in every(...)` loop in this snippet).
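An equivalent, more compact way to write such a helper is with `itertools.islice` from the standard library; the sketch below is one possible variant, using the same `do_something_with` placeholder:

from itertools import islice

def every(thing, n):
    """Yield successive lists of up to n items from thing."""
    it = iter(thing)
    while True:
        batch = list(islice(it, n))   # take at most n items
        if not batch:                 # iterator exhausted
            return
        yield batch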

Answer 1 (score: 1):

from os import listdir
from os.path import join

directory = 'Directory'   # listdir() takes a directory path, not a glob pattern

# full paths of all .jpg files in the directory
fnames = [join(directory, fname) for fname in listdir(directory) if fname.endswith('.jpg')]

batchsize = 500
l_index, r_index = 0, batchsize
batch = fnames[l_index:r_index]

while batch:

    for i in batch:
        import_function(i)
    # slide the window to the next batch; an empty slice ends the loop
    l_index, r_index = r_index, r_index + batchsize
    batch = fnames[l_index:r_index]
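
The same sliding-window logic can also be expressed with a stepped `range()`; this is just a sketch using the same hypothetical `import_function`:

batchsize = 500
for l_index in range(0, len(fnames), batchsize):
    batch = fnames[l_index:l_index + batchsize]   # the final batch may be smaller
    for fname in batch:
        import_function(fname)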