I have daily temperature files that I want to merge into a single yearly file. For example, the input files are:
Day_1.dat
Toronto -22.5
Montreal -10.6
Day_2.dat
Toronto -15.5
Montreal -1.5
Day_3.dat
Toronto -5.5
Montreal 10.6
Desired output file:
Toronto -22.5 -15.5 -5.5
Montreal -10.6 -1.5 10.6
So far, this is the code I have written for this part of the program:
#Open files for reading (input) and appending (output)
readFileObj = gzip.open(readfilename, 'r') #call built in utility to unzip file for reading
appFileObj = open(outFileName, 'a')
for line in readfileobj:
    fileString = readFileObj.read(line.split()[-1]+'\n') # read last 'word' of each line
    outval = "" + str(float(filestring) +"\n" #buffer with a space and then signal end of line
    appFileObj.write(outval) #this is where I need formatting help to append outval
Answer 0 (score: 2)
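Iterating over fileinput.input here lets us walk through all the files in turn, one line at a time. We split each line on whitespace and, using the city name as the key, store the corresponding temperature (or whatever the value is) in a list.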
import fileinput
d = {}
for line in fileinput.input(['Day_1.dat', 'Day_2.dat', 'Day_3.dat']):
    city, temp = line.split()
    d.setdefault(city, []).append(temp)
Now d contains:
{'Toronto': ['-22.5', '-15.5', '-5.5'],
'Montreal': ['-10.6', '-1.5', '10.6']}
Now we can simply iterate over this dictionary and write the data to the output file.
with open('output_file', 'w') as f:
    for city, values in d.items():
        f.write('{} {}\n'.format(city, ' '.join(values)))
Output:
$ cat output_file
Toronto -22.5 -15.5 -5.5
Montreal -10.6 -1.5 10.6
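Note that before Python 3.7, dictionaries do not guarantee any particular order, so the output here may list Montreal first and then Toronto. If order matters on those versions, you need to use collections.OrderedDict. From Python 3.7 onward, a plain dict preserves insertion order.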
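A minimal sketch of the order-preserving variant, assuming the lines have already been read into a list. The helper name `merge_lines` and the `sample` data are illustrative only and are not part of the original answer; the sample values are taken from the input files in the question.

```python
from collections import OrderedDict


def merge_lines(lines):
    """Group temperature strings by city, keeping cities in first-seen order."""
    merged = OrderedDict()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        city, temp = line.split()
        merged.setdefault(city, []).append(temp)
    return merged


# Lines as they would be read from Day_1.dat, Day_2.dat and Day_3.dat, in that order.
sample = [
    "Toronto -22.5", "Montreal -10.6",
    "Toronto -15.5", "Montreal -1.5",
    "Toronto -5.5", "Montreal 10.6",
]
merged = merge_lines(sample)
for city, temps in merged.items():
    print(city, ' '.join(temps))
```

Because Toronto appears first in the input, it is also written first, regardless of the Python version.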
A working version of your code:
import gzip

d = {}
# Assumes `filenames` is a list of all the gzip files to be opened.
for readfilename in filenames:
    # Populate the dictionary by collecting data from each file.
    # 'rt' opens the file in text mode, so each line is a str (needed on Python 3).
    with gzip.open(readfilename, 'rt') as f:
        for line in f:
            city, temp = line.split()
            d.setdefault(city, []).append(temp)
# Now write to the output file
with open(outFileName, 'w') as f:
    for city, values in d.items():
        f.write('{} {}\n'.format(city, ' '.join(values)))
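For a full year of files, the order of the file list decides the order of the columns, and a plain alphabetical sort would put Day_10 before Day_2. The following is a self-contained sketch of the same approach that sorts by day number first. It assumes the files are named like Day_N.dat.gz; the helpers `day_number` and `merge_daily_files`, the `.gz` suffix, and the output name `year.dat` are hypothetical choices made for illustration, not part of the original code.

```python
import gzip
import os
import re
import tempfile


def day_number(path):
    """Return the integer day from a name like 'Day_12.dat.gz' (assumed pattern)."""
    match = re.search(r'Day_(\d+)', os.path.basename(path))
    return int(match.group(1))


def merge_daily_files(filenames, out_path):
    """Collect one temperature per city per file and write one line per city."""
    merged = {}
    for name in sorted(filenames, key=day_number):
        # 'rt' yields text lines rather than bytes
        with gzip.open(name, 'rt') as f:
            for line in f:
                if not line.strip():
                    continue  # skip blank lines
                city, temp = line.split()
                merged.setdefault(city, []).append(temp)
    with open(out_path, 'w') as out:
        for city, temps in merged.items():
            out.write('{} {}\n'.format(city, ' '.join(temps)))
    return merged


# Build three small gzip files in a temporary directory for the demonstration.
workdir = tempfile.mkdtemp()
samples = {
    'Day_1.dat.gz': 'Toronto -22.5\nMontreal -10.6\n',
    'Day_2.dat.gz': 'Toronto -15.5\nMontreal -1.5\n',
    'Day_3.dat.gz': 'Toronto -5.5\nMontreal 10.6\n',
}
paths = []
for name, text in samples.items():
    path = os.path.join(workdir, name)
    with gzip.open(path, 'wt') as f:
        f.write(text)
    paths.append(path)

out_path = os.path.join(workdir, 'year.dat')
result = merge_daily_files(paths, out_path)
with open(out_path) as f:
    output_text = f.read()
print(output_text)
```

Opening the output once in 'w' mode after all inputs have been read avoids the append-per-file pattern from the question, which cannot extend lines that are already written.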