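I have a large number of files stored in a folder. I need to run the same Python script on each pair of files and write the results to a txt or Excel file. How can I automate this with a Python or batch script?
For example, the files in the folder are: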
ap.hdf5
sta.hdf5
ap_20150909_154518_00.hdf5
sta_20150909_154518_00.hdf5
ap_20150909_154530_00.hdf5
sta_20150909_154530_00.hdf5
ap_20150909_154541_00.hdf5
sta_20150909_154541_00.hdf5
The files are ordered by modification date. I need to run the same Python script on each pair and write the results to a text file, for example:
python result.py ap.hdf5 sta.hdf5
python result.py ap_20150909_154518_00.hdf5 sta_20150909_154518_00.hdf5
How can I create a batch file that automates this process? Thanks in advance.
Edit: the files in the folder are actually slightly different:
ap.hdf5
sta.hdf5
ap_20150909_154518_00.hdf5
sta_20150909_154524_00.hdf5
ap_20150909_154530_00.hdf5
sta_20150909_154536_00.hdf5
ap_20150909_154541_00.hdf5
sta_20150909_154547_00.hdf5
Here, each sta file is recorded a few seconds after the corresponding ap file.

Answer 0 (score: 2)
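You can use the glob module to get a list of the files that start with ap. You can then change ap to sta to get the name of the paired file (assuming a matching pair always exists). Once you have both names, you can process them as you did before.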
import glob

# iterate over all files starting with ap and ending in .hdf5
for file_a in glob.iglob("ap*.hdf5"):
    # replace the beginning of the filename with sta
    file_b = "sta" + file_a[2:]
    # do your processing (result.py) using file_a and file_b as your pair
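
The name substitution above can be wrapped in a small helper so it can be checked on a few names before touching real files. This is only a minimal sketch: `sta_name_for` is a hypothetical name, not part of the answer, and it assumes the ap and sta names differ only in their prefix (which does not hold for the edited file list in the question).

```python
import os.path


def sta_name_for(ap_path):
    """Return the sta path assumed to pair with an ap path (hypothetical helper)."""
    folder, filename = os.path.split(ap_path)
    if not filename.startswith("ap"):
        raise ValueError("not an ap file: {}".format(ap_path))
    # Swap the "ap" prefix for "sta" and keep the rest of the name unchanged
    return os.path.join(folder, "sta" + filename[2:])
```

Calling `sta_name_for("ap.hdf5")` gives `"sta.hdf5"`; a name that does not start with ap raises `ValueError` instead of silently producing a wrong pair.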

Answer 1 (score: 1)
If you really don't want to modify result.py to do this, you can use the following approach:
import subprocess
import glob
import os.path

with open('output.txt', 'w') as f_output:
    # ap files sorted by modification time
    files = sorted(glob.glob('ap*.hdf5'), key=os.path.getmtime)
    for ap in files:
        path, filename = os.path.split(ap)
        sta = os.path.join(path, 'sta{}'.format(filename[2:]))
        # Do we have an ap/sta pair?
        if os.path.exists(sta):
            # Launch the Python script with the required parameters;
            # universal_newlines=True returns stdout and stderr as text
            p = subprocess.Popen(['python.exe', 'result.py', ap, sta], stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
            out, err = p.communicate()
            # Write stdout to a file and stderr to the screen
            f_output.write(out)
            print(err)
        else:
            print('{} is missing'.format(sta))
This runs the Python script for each ap/sta file pair, in order of modification date, and writes any output from the script to output.txt.
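
The question also mentions Excel as a possible output. One way to get there without extra libraries is a CSV file, which Excel can open. The sketch below is an assumption-laden illustration, not part of the answer: `write_results_csv` and the column names are hypothetical, and it assumes the script output for each pair has already been collected as text.

```python
import csv


def write_results_csv(rows, csv_path):
    """Write (ap_name, sta_name, output) tuples to a CSV file with a header row.

    Hypothetical helper: the column names are illustrative only.
    """
    # newline="" lets the csv module control line endings (Python 3)
    with open(csv_path, "w", newline="") as f_csv:
        writer = csv.writer(f_csv)
        writer.writerow(["ap_file", "sta_file", "output"])
        for ap_name, sta_name, output in rows:
            # Strip the trailing newline that printed output usually carries
            writer.writerow([ap_name, sta_name, output.strip()])
```

Each row could be built inside the loop from `ap`, `sta` and the captured `out`, then written once after all pairs have been processed.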
Update: based on the updated question, the following script pairs the ap and sta files by filename. It stops if a suitable pair cannot be found:
import subprocess
import glob
import os.path

def sort_by_ending(filename):
    # Sort key: the parts of the filename after the first underscore
    filename = os.path.split(filename)[1]
    if '_' in filename:
        return filename.split('_')[1:]
    else:
        return [filename]

folder = r'c:\test'

with open('output.txt', 'w') as f_output:
    # ap and sta files sorted by filename ending
    files = sorted(glob.glob(os.path.join(folder, 'ap*.hdf5')) + glob.glob(os.path.join(folder, 'sta*.hdf5')), key=sort_by_ending)
    # Take the sorted files two at a time
    for ap, sta in zip(*[iter(files)]*2):
        print("'{}' and '{}'".format(os.path.split(ap)[1], os.path.split(sta)[1]))
        # Do we have an ap/sta pair?
        if os.path.split(ap)[1].startswith('ap') and os.path.split(sta)[1].startswith('sta'):
            # Launch the Python script with the required parameters;
            # universal_newlines=True returns stdout and stderr as text
            p = subprocess.Popen(['python.exe', 'result.py', ap, sta], stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
            out, err = p.communicate()
            # Write stdout to a file and stderr to the screen
            f_output.write(out)
            print(err)
        else:
            print('{} is missing'.format(sta))
            break
For the example filenames you gave, this prints the following output:
'ap_20150909_154518_00.hdf5' and 'sta_20150909_154524_00.hdf5'
'ap_20150909_154530_00.hdf5' and 'sta_20150909_154536_00.hdf5'
'ap_20150909_154541_00.hdf5' and 'sta_20150909_154547_00.hdf5'
'ap.hdf5' and 'sta.hdf5'
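
The script above stops at the first mismatch. As an alternative sketch (not part of the original answer), the pairing step can be written as a pure function over a list of names, so it can be checked without running result.py. `pair_files` and `sort_key` are hypothetical names, and the rule "each ap file is paired with the next sta file in sorted order" is an assumption based on the edit in the question.

```python
import os.path


def sort_key(path):
    # Sort by the parts after the first underscore; bare names such as
    # ap.hdf5 have no underscore and use the whole filename instead
    name = os.path.basename(path)
    return name.split("_")[1:] if "_" in name else [name]


def pair_files(paths):
    """Return (pairs, unmatched) for a list of ap/sta paths (hypothetical helper)."""
    pairs, unmatched = [], []
    pending_ap = None
    for path in sorted(paths, key=sort_key):
        name = os.path.basename(path)
        if name.startswith("ap"):
            # A second ap in a row means the previous one has no sta partner
            if pending_ap is not None:
                unmatched.append(pending_ap)
            pending_ap = path
        elif name.startswith("sta"):
            if pending_ap is None:
                unmatched.append(path)
            else:
                pairs.append((pending_ap, path))
                pending_ap = None
    if pending_ap is not None:
        unmatched.append(pending_ap)
    return pairs, unmatched
```

Files left without a partner are returned in `unmatched` rather than ending the run, so the remaining pairs can still be processed and the gaps reported afterwards.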

Answer 2 (score: 0)
import glob, os

ap_files = glob.glob("outdir/ap*.hdf5")  # use glob to get the ap files
for ap_file in ap_files:  # walk through the ap file names
    basename = os.path.basename(ap_file)  # split off the base name
    sta_file = "outdir/" + basename.replace("ap", "sta", 1)  # make the sta name
    dothingsto(ap_file, sta_file)  # placeholder: do whatever you wish with the two files
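
`dothingsto` is left undefined in the snippet above. One possible way to fill it in is sketched below, under two assumptions: result.py takes the two file names as command-line arguments and prints its result to stdout, and the results should be appended to a text file. `run_pair` and its `script` and `output_path` parameters are hypothetical names.

```python
import subprocess
import sys


def run_pair(ap_file, sta_file, script="result.py", output_path="output.txt"):
    """Run the script on one ap/sta pair and append its stdout to a text file.

    Hypothetical helper; returns the exit code of the script.
    """
    # sys.executable is the interpreter running this code, which avoids
    # depending on "python" being on the PATH
    proc = subprocess.run(
        [sys.executable, script, ap_file, sta_file],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # capture output as text rather than bytes
    )
    # Append, so successive pairs accumulate in the same file
    with open(output_path, "a") as f_output:
        f_output.write(proc.stdout)
    if proc.returncode != 0:
        # Show the script's error messages on the screen
        sys.stderr.write(proc.stderr)
    return proc.returncode
```

With this in place, `dothingsto(ap_file, sta_file)` in the loop could be replaced by `run_pair(ap_file, sta_file)`. Because the file is opened in append mode, an existing output.txt should be deleted before a fresh run.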