I have a file in this format:
a11 0.0
a12 132.0
b13 0.0
b42 584.0
randomstuff
etc
a11 0.0
a12 6.0
b13 138.0
b42 6.0
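There are thousands of a##, b##, c##, etc. combinations, but they repeat over and over with some useless lines in between. I want to add up all the numbers for each item, so that I am left with only: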
a11, 0
a12, 138
b13, 138
b42, 590
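I need some way to generate each identifier (a11, a12, etc.), because there are thousands of different ones.
Answer 0 (score: 1)
To generate all the combinations, a simple approach is three nested loops: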
for letter in 'abcdefghijklmnopqrstuvwxyz':
    for digit1 in '0123456789':
        for digit2 in '0123456789':
            print(letter + digit1 + digit2)
This generates every code from a00 through z99.
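The same enumeration can also be written with itertools.product from the standard library. This is only a minimal sketch; generate_codes is a hypothetical helper name that does not appear in the original answer, and it assumes every identifier is one lowercase letter followed by two digits.

```python
from itertools import product
from string import ascii_lowercase, digits


def generate_codes():
    """Yield every identifier made of one lowercase letter and two digits."""
    # product() iterates the rightmost argument fastest, so the order
    # matches three nested loops: a00, a01, ..., a99, b00, ..., z99
    for letter, digit1, digit2 in product(ascii_lowercase, digits, digits):
        yield letter + digit1 + digit2


codes = list(generate_codes())
print(len(codes), codes[0], codes[-1])
```

Taking the alphabet from string.ascii_lowercase avoids typing it by hand, which makes it harder to drop a letter by accident.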
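To parse this data, though, it is probably easier to check that each input line follows the format and then accumulate the values in a dictionary: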
code_sums = {}  # maps each code (e.g. "a11") to its running total

with open("input_file.txt", "rt") as fin:
    for row in fin:
        # check the line is good input
        # split() with no arguments strips the line and treats any run of
        # spaces or tabs as a single separator
        fields = row.split()
        # verify there are only two values in the line
        if len(fields) != 2:
            continue
        code, value = fields
        # the code must be one lowercase letter followed by two digits
        if (len(code) == 3 and
                code[0] in 'abcdefghijklmnopqrstuvwxyz' and
                code[1].isdigit() and
                code[2].isdigit()):
            try:
                float_val = float(value)
            except ValueError:
                continue  # probably a malformed input line
            # looks like we have valid input, tally the value
            code_sums[code] = code_sums.get(code, 0.0) + float_val

#for code in code_sums:
#    print("%s -> %7.1f" % (code, code_sums[code]))

# TODO - handle errors when opening the output file
with open("output_file.csv", "wt") as fout:
    fout.write("Code,Sum\n")
    for code in code_sums:
        # %.1f rather than %7.1f, so the CSV field is not padded with spaces
        fout.write("%s,%.1f\n" % (code, code_sums[code]))
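The validate-then-tally idea above can also be expressed with a regular expression and collections.defaultdict. This is a sketch under assumptions, not the original answer's method: sum_codes and LINE_RE are hypothetical names, and the pattern assumes a valid line is exactly one lowercase letter, two digits, whitespace, and a number. The sample lines are the ones from the question.

```python
import re
from collections import defaultdict

# Assumed line format: one lowercase letter, two digits, whitespace, one value
LINE_RE = re.compile(r"^([a-z][0-9][0-9])\s+(\S+)$")


def sum_codes(lines):
    """Return a dict mapping each valid code to the sum of its values."""
    totals = defaultdict(float)  # missing codes start at 0.0
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # junk line such as "randomstuff"
        code, value = match.groups()
        try:
            totals[code] += float(value)
        except ValueError:
            continue  # second field was not a number
    return dict(totals)


sample = """a11 0.0
a12 132.0
b13 0.0
b42 584.0
randomstuff
etc
a11 0.0
a12 6.0
b13 138.0
b42 6.0""".splitlines()

totals = sum_codes(sample)
for code in sorted(totals):
    print(f"{code}, {totals[code]:g}")
```

For the sample input this yields the totals asked for in the question: 0 for a11, 138 for a12, 138 for b13, and 590 for b42. To process a real file, an open file object can be passed in place of the sample list, since the function only iterates over lines.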