Question

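I am using Python 2.6.

I read in n files, process the data in them with a loop, and write the results to a single output file.

The input files are named inputfile_date_time.h5, where the date/time is different for each input file.

I want to name the output file outputfile_firstdate_firsttime_lastdate_lasttime.pkt, where firstdate_firsttime is the date and time of the first input file (that is, the date/time part of the name of the first file in the sequence of n files) and lastdate_lasttime is the date and time of the last input file (the date/time part of the name of the last file in the sequence).

My code is currently set up as follows: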

import os
from glob import glob
from os.path import basename
import numpy
import h5py
#set location/directory of input files
inputdir = "/Location of directory that contains files"

#create output file
outputfilename = 'outputfilename'
outputfile = "/Location to put output file/"+basename(outputfilename)[:-4]+".pkt"
ofile = open(outputfile, 'wb')

for path, dirs, files in os.walk(inputdir):
    files_list = glob(os.path.join(inputdir, '*.h5'))
    for file in files_list:
        # glob already returns full paths, so each one can be opened directly
        f = h5py.File(file, 'r')
        f.close()
    #for loop performing the necessary task to the information in the files
    #print that the output file was written
    print "Wrote " + outputfile
#close output file
ofile.close()

This code creates an output file named outputfile.pkt.

How can I adjust this code to make the change I described above?

Answer 1

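time.strptime can parse any time format you want, and time.strftime can generate any time format you want. You should read (and possibly parse) the date/time from every input filename, then use min(...) and max(...) to get the earliest and the latest.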
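As a rough sketch of that idea: the question does not say what the date and time look like inside the filenames, so the pattern YYYYMMDD_HHMMSS and the sample stamps below are assumptions for illustration only. The stamps are parsed with time.strptime, compared with min and max, and formatted back with time.strftime:

```python
import time

# Assumed stamp layout (not given in the question); adjust to the real names.
STAMP_FORMAT = "%Y%m%d_%H%M%S"

# Hypothetical date_time parts cut out of names like inputfile_<date>_<time>.h5
stamps = ["20140617_083000", "20140616_120000", "20140616_235959"]

# struct_time values compare field by field (year, month, day, hour, ...),
# so min/max pick the earliest and latest timestamps.
parsed = [time.strptime(s, STAMP_FORMAT) for s in stamps]
first = time.strftime(STAMP_FORMAT, min(parsed))
last = time.strftime(STAMP_FORMAT, max(parsed))

print(first + "_" + last)
```

Parsing matters mainly when the stamps are not zero-padded or not year-first; otherwise comparing the raw strings gives the same order.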

For example, if the filenames look like foo2014-06-16bar.txt and hello2014-06-17world, here is how to parse them:

import re
files = ['foo2014-06-16bar.txt', 'hello2014-06-17world']
dates = [re.search(r'(?:19|20)\d{2}-\d{2}-\d{2}', f).group() for f in files]
print min(dates)  #: 2014-06-16
print max(dates)  #: 2014-06-17

Here is how to build the files list using os.walk:

import os
inputdir = "/Location of directory that contains files"
files = []
for dirpath, dirnames, filenames in os.walk(inputdir):
  for filename in filenames:
    if filename.endswith('.h5'):
      pathname = os.path.join(dirpath, filename)
      files.append(pathname)
print files
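Putting the pieces together, the output name could be built from the collected paths roughly as follows. This is only a sketch: build_output_name is a hypothetical helper, the example paths are made up, and it assumes the date_time part is zero-padded and year-first so that plain string comparison matches chronological order.

```python
import os

def build_output_name(paths, prefix="inputfile_",
                      out_prefix="outputfile_", out_ext=".pkt"):
    """Return out_prefix + earliest stamp + '_' + latest stamp + out_ext."""
    stamps = []
    for p in paths:
        # Drop the directory and the .h5 extension
        stem = os.path.splitext(os.path.basename(p))[0]
        if stem.startswith(prefix):
            # Keep only the 'date_time' part of the name
            stamps.append(stem[len(prefix):])
    if not stamps:
        raise ValueError("no input files starting with %r" % prefix)
    return out_prefix + min(stamps) + "_" + max(stamps) + out_ext

# Hypothetical paths, in the shape the os.walk loop would collect
example = ["/data/inputfile_20140617_083000.h5",
           "/data/inputfile_20140616_120000.h5"]
print(build_output_name(example))
```

The returned name can then be joined to the output directory and passed to open(..., 'wb') in place of the hard-coded outputfile.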

Python - name the output file using part of the input file names
