Python script to join values by row and remove duplicates

Time: 2017-08-22 16:35:16

Tags: python python-2.7 file

I am using Python 2.7 and I have a text file that looks like this:

id     value
---    ----
1      x
2      a
1      z
1      y
2      b

I am trying to get output that looks like this:

id     value
---    ----
1      x,z,y
2      a,b

Thanks a lot!

3 answers:

Answer 0 (score: 2)

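The simplest solution is to use collections.defaultdict together with collections.OrderedDict. If you don't care about the order of the values, you can also use a set instead of the OrderedDict.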

from collections import defaultdict, OrderedDict

# Keeps all unique values for each id
dd = defaultdict(OrderedDict)
# Keeps the unique ids in order of appearance
ids = OrderedDict()

with open(yourfilename) as f:
    # skip the two header lines
    next(f), next(f)
    for line in f:
        if not line.strip():
            continue  # ignore blank lines, e.g. an empty line at the end of the file
        id_, value = line.split()  # split() without arguments already drops empty strings
        dd[id_][value] = None  # dicts need a value, but here it doesn't matter which one...
        ids[id_] = None

print('id     value')
print('---    ----')
for id_ in ids:
    print('{}      {}'.format(id_, ','.join(dd[id_])))

Result:

id     value
---    ----
1      x,z,y
2      a,b
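The answer notes that a set can replace the OrderedDict when the order of values does not matter. Here is a minimal sketch of that variant. The names sample_lines and group_unique_unordered are made up for illustration, and the sample rows stand in for the data lines of the file with the two header lines already skipped.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for the file's data lines
# (the two header lines are assumed to be skipped already).
sample_lines = ["1      x", "2      a", "1      z", "1      y", "2      b", "1      z"]


def group_unique_unordered(lines):
    """Map each id to the set of its distinct values (order is not preserved)."""
    groups = defaultdict(set)
    for line in lines:
        id_, value = line.split()
        groups[id_].add(value)  # adding an existing value to a set is a no-op
    return groups


groups = group_unique_unordered(sample_lines)
for id_ in sorted(groups):
    # sorted() makes the output deterministic; a set has no defined order
    print('{}      {}'.format(id_, ','.join(sorted(groups[id_]))))
```

Because the values are sorted for printing, id 1 comes out as x,y,z rather than in the file order x,z,y. That is the trade-off of dropping OrderedDict.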

If you want to write this to another file, just join the lines I printed with \n and write them to the file.
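As a rough sketch of that suggestion, the lines can be collected in a list, joined with \n, and written in one call. The helper name write_grouped, the sample data, and the temporary output path are assumptions for illustration, not part of the original answer.

```python
import os
import tempfile


def write_grouped(path, ids, grouped):
    """Write the header plus one 'id      v1,v2,...' line per id to path."""
    lines = ['id     value', '---    ----']
    for id_ in ids:
        lines.append('{}      {}'.format(id_, ','.join(grouped[id_])))
    with open(path, 'w') as out:
        out.write('\n'.join(lines) + '\n')


# Hypothetical data in the shape produced by the loop above:
# ids in order of appearance, and the unique values for each id.
ids = ['1', '2']
grouped = {'1': ['x', 'z', 'y'], '2': ['a', 'b']}

# Hypothetical output location; replace with the real target file.
out_path = os.path.join(tempfile.mkdtemp(), 'grouped.txt')
write_grouped(out_path, ids, grouped)
```

With the answer's own data structures, the call would be write_grouped(some_path, ids, dd), since joining an OrderedDict iterates over its keys.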

Answer 1 (score: 1)

I think this would also work, although the other answer looks more sophisticated:

rows = ['1,x',
'2,a',
'1,z',
'1,y',
'2,b',
'2,a',  # added extra values to show duplicates won't be added
'1,z',
'1,y']

output = {}

for row in rows:
    parts = row.split(",")
    id_ = parts[0]
    value = parts[1]
    if id_ not in output:
        output[id_] = value
    else:
        # Split the stored string back into values; list() on a string would
        # yield single characters and miss multi-character duplicates.
        a_List = output[id_].split(",")
        if value not in a_List:
            output[id_] += "," + value

You end up with a dictionary similar to what you asked for.
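To get from that dictionary to the table layout in the question, one possible sketch is shown here. The function name format_table is made up for illustration, and the dictionary literal is assumed to match what the loop above builds. Ids are sorted because a plain dict in Python 2.7 does not keep insertion order.

```python
# Hypothetical dictionary in the shape the loop above builds:
# each id maps to its comma-joined unique values.
output = {'1': 'x,z,y', '2': 'a,b'}


def format_table(mapping):
    """Return the header plus one line per id, sorted by id for a stable order."""
    lines = ['id     value', '---    ----']
    for id_ in sorted(mapping):
        lines.append('{}      {}'.format(id_, mapping[id_]))
    return '\n'.join(lines)


table = format_table(output)
print(table)
```

The ids are strings here, so the sort is lexicographic and "10" would come before "2". Passing key=int to sorted() is an option if all ids are numeric.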

Answer 2 (score: 0)

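# read the file (the filename was left empty in the original answer)
fp = open('', 'r')
lines = fp.read().split("\n")
fp.close()
# split each data line into [id, value]; this assumes the two header lines
# shown in the question, and skips them along with any blank lines
rows = [line.split() for line in lines[2:] if line.strip()]
# collect the unique values for each id
m = {}
for id_, value in rows:
    if id_ not in m:
        m[id_] = [value]
    elif value not in m[id_]:
        m[id_].append(value)
for id_ in m:
    print("{} {}".format(id_, ",".join(m[id_])))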