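I want my Python script to load all the jpg images from a specific directory, but not all at once: 500 images at a time instead.
One possible solution is the following: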
from glob import glob

i = 0
batch = 500
# Read images from file
for filename in glob('Directory/*.jpg'):
    i = i + 1
    if i % batch == 0:
        # apply an algorithm with this data-batch #
        pass
Is this correct?
Is there a more efficient way to do this?
Answer 0: (score: 1)
from glob import glob

batch_size = 500
filenames = glob(...)  # fill with your own details
nfiles = len(filenames)
nbatches, remainder = divmod(nfiles, batch_size)
for i in range(nbatches):  # use xrange() on Python 2
    batch = filenames[batch_size * i:batch_size * (i + 1)]
    do_something_with(batch)
# Handle the final, partial batch when nfiles is not a multiple of batch_size
if remainder:
    do_something_with(filenames[batch_size * nbatches:])
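To see how the divmod arithmetic splits a list, here is a minimal sketch that runs on made-up file names instead of a real directory. The helper name split_into_batches and the generated names are assumptions for illustration only.

```python
def split_into_batches(items, batch_size):
    """Return consecutive slices of `items`, each at most `batch_size` long."""
    nbatches, remainder = divmod(len(items), batch_size)
    # Full batches first.
    batches = [items[batch_size * i:batch_size * (i + 1)] for i in range(nbatches)]
    # Then the leftover items, if the length is not a multiple of batch_size.
    if remainder:
        batches.append(items[batch_size * nbatches:])
    return batches


# Hypothetical file names standing in for the result of glob().
names = ["img_%04d.jpg" % k for k in range(1203)]
batches = split_into_batches(names, 500)
print([len(b) for b in batches])  # [500, 500, 203]
```

With 1203 names and a batch size of 500, divmod gives 2 full batches and a remainder of 203, which becomes the last, shorter batch.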
A version that uses a generator to take every N elements from a possibly endless iterable:
import glob


def every(thing, n):
    """every(ABCDEFG, 2) --> AB CD EF G"""
    toexit = False
    it = iter(thing)
    while not toexit:
        batch = []
        for i in range(n):  # use xrange() on Python 2
            try:
                batch.append(next(it))
            except StopIteration:
                # Source is exhausted: stop filling and finish after this batch
                toexit = True
                break
        if not batch:
            break
        yield batch


filenames_i = glob.iglob("...")
for batch in every(filenames_i, 500):
    do_something_with(batch)
This makes iterating over the batches more concise (the for batch in every() loop in this snippet).
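The same batching can also be sketched with itertools.islice from the standard library. This is an alternative formulation rather than the code from this answer, and the name batched_islice is made up for illustration.

```python
from itertools import count, islice


def batched_islice(iterable, n):
    """batched_islice('ABCDEFG', 2) --> ['A', 'B'] ['C', 'D'] ['E', 'F'] ['G']"""
    it = iter(iterable)
    while True:
        # islice pulls at most n items and stops early if the source runs out.
        batch = list(islice(it, n))
        if not batch:
            return
        yield batch


result = list(batched_islice("ABCDEFG", 2))
print(result)  # [['A', 'B'], ['C', 'D'], ['E', 'F'], ['G']]

# Because batches are produced lazily, an endless source is fine too.
first_three = next(batched_islice(count(), 3))
print(first_three)  # [0, 1, 2]
```

Since only one batch is held in memory at a time, this pairs naturally with a lazy source such as glob.iglob.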
Answer 1: (score: 1)
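from os import listdir
from os.path import join

# listdir() takes a directory path, not a wildcard pattern
directory = 'Directory'
# listdir() returns bare file names, so join each one with the directory
fnames = [join(directory, fname) for fname in listdir(directory) if fname.endswith('.jpg')]
batchsize = 500
l_index, r_index = 0, batchsize
batch = fnames[l_index:r_index]
while batch:
    for i in batch:
        import_function(i)
    # Slide the window forward; an empty slice ends the loop
    l_index, r_index = r_index, r_index + batchsize
    batch = fnames[l_index:r_index]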
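One detail worth checking with os.listdir is that it expects a directory path rather than a wildcard pattern, and that it returns bare file names, so each name generally has to be joined with the directory before the file can be opened. A minimal sketch, assuming a temporary directory with made-up file names and a batch size of 2 to keep the output short:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    # Create a few empty placeholder files (hypothetical names).
    for name in ["a.jpg", "b.jpg", "c.jpg", "notes.txt"]:
        open(os.path.join(directory, name), "w").close()

    # listdir() gives names only; sort them because its order is arbitrary.
    bare_names = sorted(f for f in os.listdir(directory) if f.endswith(".jpg"))
    full_paths = [os.path.join(directory, f) for f in bare_names]
    all_exist = all(os.path.exists(p) for p in full_paths)

    # Walk the list with a sliding slice window, as in the loop above.
    batchsize = 2
    batch_sizes = []
    l_index, r_index = 0, batchsize
    batch = full_paths[l_index:r_index]
    while batch:
        batch_sizes.append(len(batch))
        l_index, r_index = r_index, r_index + batchsize
        batch = full_paths[l_index:r_index]

print(bare_names)   # ['a.jpg', 'b.jpg', 'c.jpg']
print(batch_sizes)  # [2, 1]
```

The loop ends on its own because slicing past the end of a list yields an empty list, which is falsy.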