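I have a folder containing a large number of text files (more than 2000) that I need to iterate over. This is what I am doing at the moment: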
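import os

# Raw strings keep the backslashes in Windows paths literal.
filepath = r'E:\Data'
save_path = r'E:\Results'

for file in os.listdir(filepath):
    if file.endswith('.txt'):
        with open(os.path.join(filepath, file), 'r') as myfile:
            for eachline in myfile:
                # NOTE: a 6-character slice can never equal the 3-character
                # string 'AKJ'; adjust the slice or the ID to match the data.
                MainID = eachline[:6]
                if MainID == 'AKJ':
                    for field in eachline.split():
                        parts = field.split(',')
                        MainID = parts[1]
                        Origin = parts[9]
                        Price = parts[13]
                        # Append to a result file with the same name as the input.
                        with open(os.path.join(save_path, file), 'a') as fo1:
                            fo1.write('%s,%s,%s\n' % (MainID, Origin, Price))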
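However, I need to run my process on the first 100 files, then on the next 100 files, and so on until the end of the folder, rather than iterating over all the files in one pass as the code above does. Any help would be much appreciated.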
Answer 0 (score: 1)
files = [file for file in os.listdir(filepath) if file.endswith('.txt')]
batchsize = 100
index = 0
remaining = len(files)
while remaining > 0:
    batch = min(remaining, batchsize)
    print('NEW BATCH')
    for file in files[index:index+batch]:
        with open(os.path.join(filepath, file), 'r') as myfile:
            print(' ', file)
    index += batch
    remaining -= batch
Answer 1 (score: 0)
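# Full paths of every entry in the folder (no extension filter is applied here).
complete_file_paths = [os.path.join(filepath, file) for file in os.listdir(filepath)]
# Lazily yield consecutive slices of at most 100 paths each.
chunks_of_100 = (complete_file_paths[i:i+100] for i in range(0, len(complete_file_paths), 100))
for chunk in chunks_of_100:
    print(chunk)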
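The same slicing idea can be wrapped in a small reusable generator, which keeps the batching separate from the per-file work. This is only a minimal sketch: `iter_batches` and `list_txt_files` are hypothetical helper names that do not appear in the question. Sorting the file names is an added assumption, made because `os.listdir` returns entries in arbitrary order and sorting keeps the batches reproducible between runs.

```python
import os


def iter_batches(items, batch_size=100):
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def list_txt_files(folder):
    """Return the sorted names of the .txt files in `folder`."""
    # Sorting is an assumption added here; os.listdir gives no order guarantee.
    return sorted(name for name in os.listdir(folder) if name.endswith('.txt'))
```

It could then be used roughly like this, with the per-file processing from the question going inside the inner loop:

```python
for number, batch in enumerate(iter_batches(list_txt_files(filepath)), start=1):
    print('batch', number, 'contains', len(batch), 'files')
    for name in batch:
        with open(os.path.join(filepath, name), 'r') as myfile:
            pass  # process one file here
```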