I have 2 columns:
RAB10 -0.016575
MEF2C -0.019655
SP2 1.24E-01
SP2 -0.18870625
SP2 0.0879425
I want the output to be:
RAB10 -0.016575
MEF2C -0.019655
SP2 1.24E-01
    -0.18870625
    0.0879425
For a duplicate key, I want the key to appear only once, followed by all of its corresponding values.
I wrote this code for it:
import math
import numpy
import csv
import collections
from decimal import *
from collections import defaultdict

with open('output.csv', 'rb') as file:
    contents = csv.reader(file)
    # storing content of Common genes Result edited file in matrix
    matrix = list()
    for row in contents:
        matrix.append(row)

# to get both the index and the item
for index, item in enumerate(matrix):
    # to access 2nd column value
    first_column = [row[0] for row in matrix]
    second_column = [row[1] for row in matrix]

for q, a in zip(first_column, second_column):
    if q == q:
        print('%s %s' % (q, a))
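This code only returns the keys and values; it does not merge the values under a single key when the key is repeated.
Answer 0 (score: 2)
You need to actually use the defaultdict and append the values to it.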
from collections import defaultdict
d = defaultdict(list)
with open('output.csv') as f:
    for line in f:  # loop over each line
        spl = line.split()  # split, "RAB10 -0.016575" -> ["RAB10", "-0.016575"]
        d[spl[0]].append(spl[1])  # append value
print(d)
defaultdict(<type 'list'>, {'MEF2C': ['-0.019655'], 'RAB10': ['-0.016575'], 'SP2': ['1.24E-01', '-0.18870625', '0.0879425']})
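If you want the values to be floats, use d[spl[0]].append(float(spl[1])), but only if you are sure that all of the data is in the same format; otherwise you will need a try/except block.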
from collections import defaultdict
d = defaultdict(list)
with open("output.csv") as f:
    for line in f:
        spl = line.split()
        try:
            # parse first, so a bad line does not leave an empty list under its key
            key, value = spl[0], float(spl[1])
        except (ValueError, IndexError):  # catch lines that don't have at least two elements or whose second element is not a float
            continue
        d[key].append(value)
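To see which lines a try/except loop of this kind keeps, here is a minimal Python 3 sketch that runs over in-memory lines instead of output.csv. The helper name group_floats and the two malformed sample lines are assumptions made for illustration; they are not part of the original answer.

```python
from collections import defaultdict


def group_floats(lines):
    """Group float values by key, skipping malformed lines (hypothetical helper)."""
    d = defaultdict(list)
    for line in lines:
        spl = line.split()
        try:
            # Parse before touching d, so a bad line never creates an empty entry.
            key, value = spl[0], float(spl[1])
        except (ValueError, IndexError):
            # Blank line, missing value, or a value that is not a float.
            continue
        d[key].append(value)
    return d


sample = [
    "RAB10 -0.016575",
    "MEF2C -0.019655",
    "SP2 1.24E-01",
    "SP2 -0.18870625",
    "SP2 0.0879425",
    "BROKEN",    # assumed malformed line: no value
    "SP2 n/a",   # assumed malformed line: value is not a float
]
grouped = group_floats(sample)
print(dict(grouped))
```

Both malformed lines are skipped silently, so SP2 ends up with exactly three floats and BROKEN never appears as a key. If dropped lines matter for your data, counting or logging them inside the except branch would be a reasonable extension.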
Answer 1 (score: 0)
Once you have the defaultdict d from Padraic Chunningham's answer, printing it out is easy:
for key, values in sorted(d.iteritems()):
    values_iter = iter(values)
    print('%s\t%s' % (key, next(values_iter)))
    for value in values_iter:
        print('\t%s' % value)
Or, more compactly (this form assumes the values are still strings):
for key, values in sorted(d.iteritems()):
    print('%s\t%s' % (key, '\n\t'.join(values)))
In Python 3, you need .items() instead of .iteritems().
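For reference, a minimal Python 3 sketch of the whole flow, from grouping to printing, might look like the following. The function name format_grouped is hypothetical, and the sample lines are taken from the question rather than read from output.csv.

```python
from collections import defaultdict


def format_grouped(d):
    """Return one line per value, showing each key only on its first line."""
    out = []
    for key, values in sorted(d.items()):
        for i, value in enumerate(values):
            # The key goes on the first line; later lines start with just a tab.
            out.append("%s\t%s" % (key if i == 0 else "", value))
    return "\n".join(out)


d = defaultdict(list)
for line in ["RAB10 -0.016575", "MEF2C -0.019655", "SP2 1.24E-01",
             "SP2 -0.18870625", "SP2 0.0879425"]:
    key, value = line.split()
    d[key].append(value)

print(format_grouped(d))
```

Because the keys are passed through sorted(), MEF2C is printed before RAB10, which differs from the input order in the question. Since %s formatting is used for each value, the same function also works if the values were stored as floats.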