I have 3 files in the following format:
File 1:
ID Var1 Var2
001 5 10
002 12 6
File 2:
ID Var1 Var3 Var5
003 5 10 9
004 12 6 1
File 3:
ID Var3 Var4
005 5 10
006 12 6
I would like the output in the following format:
ID Var1 Var2 Var3 Var4 Var5
001 5 10 0 0 0
002 12 6 0 0 0
003 5 0 10 0 9
004 12 0 6 0 1
005 0 0 5 10 0
006 0 0 12 6 0
Please tell me how to do this in Python.
Answer 0 (score: 0)
As mentioned above, you should take a look at the csv module. Here is something to help you get started.
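import os

directory = r"\path\to\my\files"  # raw string, so "\t" is not read as a tab
with open("output.txt", "w") as outfile:
    for name in os.listdir(directory):
        # os.listdir returns bare names, so join them with the directory
        with open(os.path.join(directory, name)) as f:
            for line_number, line in enumerate(f):  # iterate the file, not its name
                if line_number > 0:  # omit the headers
                    outfile.write(line)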
Manipulating files with Python seems to be a fairly common question on SO, so you could search for some of those questions to see how other people have done it.
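For what it's worth, here is a rough sketch of how the csv module mentioned above could do the whole merge, including filling in the missing columns. It assumes the fields are separated by single spaces and that every file has an ID column; the helper names `merge_tables` and `write_merged` are made up for this example, and `io.StringIO` objects stand in for the real files.

```python
import csv
import io


def merge_tables(sources):
    """Read space-delimited tables and return (sorted column names, rows).

    `sources` is an iterable of open text files (or any iterables of lines).
    Assumption: every table has a header line that includes an "ID" column.
    """
    rows = []
    columns = set()
    for src in sources:
        reader = csv.DictReader(src, delimiter=" ", skipinitialspace=True)
        for row in reader:
            rows.append(row)
            columns.update(name for name in row if name != "ID")
    return sorted(columns), rows


def write_merged(out, columns, rows, fill="0"):
    """Write all rows under one header, using `fill` for missing values."""
    writer = csv.DictWriter(out, fieldnames=["ID"] + columns, delimiter=" ",
                            restval=fill, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Small demo with in-memory stand-ins for two of the files.
file1 = io.StringIO("ID Var1 Var2\n001 5 10\n002 12 6\n")
file2 = io.StringIO("ID Var1 Var3 Var5\n003 5 10 9\n004 12 6 1\n")
columns, rows = merge_tables([file1, file2])
out = io.StringIO()
write_merged(out, columns, rows)
print(out.getvalue(), end="")
```

With real files you would pass objects from `open(...)` instead of `io.StringIO`. The `restval` argument of `csv.DictWriter` is what supplies the 0 for any column a given row does not have.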
Answer 1 (score: 0)
# use the fileinput module if you're reading multiple files at once
import fileinput

dic = {}  # initialize an empty dict. This will be used to store the value of
          # each (id, var) pair fetched from the files.
for line in fileinput.input(['file1', 'file2', 'file3']):
    # if 'ID' is present in the line then it is the header line
    if 'ID' in line:
        vars = line.split()[1:]  # extract the vars from it
                                 # for file1 vars would be ['Var1', 'Var2']
    else:  # else it is a normal line
        spl = line.split()  # split the line at whitespace
                            # for the line '001 5 10\n' this would return
                            # ['001', '5', '10']
        idx, vals = spl[0], spl[1:]  # assign the first value from spl
                                     # to idx and the rest to vals
        # now use zip to iterate over vars and vals; zip returns the
        # items at the same index from the iterables passed to it.
        for x, y in zip(vars, vals):
            dic[idx, x] = y  # use a tuple ('001', 'Var1') as key and
                             # assign the value '5' to it. Similarly
                             # ('001', 'Var2') will be assigned '10'

# get a sorted list of unique vars and IDs
vars = sorted(set(item[1] for item in dic))
idxs = sorted(set(item[0] for item in dic), key=int)
print(" ".join(vars))  # print header
# now iterate over the IDs and, for each ID, pick every var from vars
# and print the value of (id, var).
for x in idxs:
    # dict.get will return the default value '0' if a
    # combination of (id, var) is not found in the dict.
    print(x, " ".join(dic.get((x, y), '0') for y in vars))
# use string formatting for better looking output.
Output:
Var1 Var2 Var3 Var4 Var5
001 5 10 0 0 0
002 12 6 0 0 0
003 5 0 10 0 9
004 12 0 6 0 1
005 0 0 5 10 0
006 0 0 12 6 0
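The last comment in the code suggests string formatting for nicer output. One way that might look is sketched below, assuming the same `(id, var)`-keyed dict as in the answer. The function name `format_table` is invented for this example, and unlike the code above it also writes an "ID" label in the header.

```python
def format_table(dic, fill="0"):
    """Return the table as a list of lines with left-aligned columns.

    `dic` maps (id, var) tuples to string values, e.g. ('001', 'Var1') -> '5'.
    """
    names = sorted({var for _, var in dic})
    ids = sorted({idx for idx, _ in dic}, key=int)
    table = [["ID"] + names]
    for idx in ids:
        table.append([idx] + [dic.get((idx, var), fill) for var in names])
    # Each column is as wide as its longest cell.
    widths = [max(len(row[i]) for row in table) for i in range(len(table[0]))]
    return [
        " ".join("{:<{}}".format(cell, width)
                 for cell, width in zip(row, widths)).rstrip()
        for row in table
    ]


# Tiny example dict; in the answer above this would be `dic`.
sample = {("001", "Var1"): "5", ("001", "Var2"): "10", ("002", "Var1"): "12"}
for line in format_table(sample):
    print(line)
```

Padding every cell to the column width keeps the values lined up under their headers even when the numbers differ in length.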
Answer 2 (score: 0)
To merge multiple files, you can use a function like this one, which takes advantage of Python's defaultdict:
from collections import defaultdict
from pprint import pprint


def read_from_file(filename, dictionary):
    with open(filename) as f:
        lines = f.read().splitlines()
        head, body = lines[0].split(), lines[1:]
        for line in body:
            for i, item in enumerate(line.split()):
                if i == 0:
                    d = dictionary[item]  # first field is the ID
                else:
                    d[head[i]] = item  # remaining fields go under their header


d = defaultdict(defaultdict)
read_from_file("file1", d)
read_from_file("file2", d)
read_from_file("file3", d)
pprint(dict(d))
Output:
{'001': defaultdict(None, {'Var1': '5', 'Var2': '10'}),
'002': defaultdict(None, {'Var1': '12', 'Var2': '6'}),
'003': defaultdict(None, {'Var5': '9', 'Var1': '5', 'Var3': '10'}),
'004': defaultdict(None, {'Var5': '1', 'Var1': '12', 'Var3': '6'}),
'005': defaultdict(None, {'Var4': '10', 'Var3': '5'}),
'006': defaultdict(None, {'Var4': '6', 'Var3': '12'})}
All that remains now is to print this dictionary of dictionaries as a table.
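That last step could go roughly as follows. This is only a sketch: `table_lines` is a name made up here, it assumes the IDs are numeric strings (so they can be sorted with `key=int`), and it fills missing values with "0" as the question asks.

```python
def table_lines(data, fill="0"):
    """Build the lines of a table from a dict of dicts keyed by ID."""
    # Collect every variable name that appears in any row.
    columns = sorted({name for row in data.values() for name in row})
    lines = [" ".join(["ID"] + columns)]
    for key in sorted(data, key=int):
        row = data[key]
        # row.get does not insert anything, even when row is a defaultdict
        lines.append(" ".join([key] + [row.get(name, fill) for name in columns]))
    return lines


# Small stand-in for the dictionary built by read_from_file.
example = {
    "001": {"Var1": "5", "Var2": "10"},
    "003": {"Var1": "5", "Var3": "10", "Var5": "9"},
}
print("\n".join(table_lines(example)))
```

With the `d` built above, `print("\n".join(table_lines(d)))` should give the layout requested in the question.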