I want to convert a file formatted like this:
1182659 Sample05 22
1182659 Sample33 14
4758741 Sample05 74
4758741 Sample33 2
3652147 Sample05 8
3652147 Sample33 34
into this:
Sample05 Sample33
1182659 22 14
4758741 74 2
3652147 8 34
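One approach I have seen is to use a doubly-indexed dictionary, but I would like to know whether there is a simpler way before I go down that road.
Answer 0 (score: 1)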
Without pandas, but using groupby from itertools:
from itertools import groupby
data = """
1182659 Sample05 22
1182659 Sample33 14
4758741 Sample05 74
4758741 Sample33 2
3652147 Sample05 8
3652147 Sample33 34
"""
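# Group consecutive lines that share the same first column (the row id).
groups = groupby((line.split() for line in data.splitlines() if line), key=lambda v: v[0])
rows = []
headers = []
for g, v in groups:
    v = list(v)
    # Collect sample names in the order they first appear.
    for i in v:
        if i[1] not in headers:
            headers.append(i[1])
    # One output row per id: the id followed by its values.
    rows.append([g] + [i[-1] for i in v])
# Header line starts with a tab so the names line up with the value columns.
print('\t' + '\t'.join(headers))
for row in rows:
    for value in row:
        print(value, end='\t')
    print()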
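Note that groupby only merges adjacent lines with the same key, so the code above relies on each id's lines sitting next to each other and on every id listing the same samples in the same order. If that cannot be guaranteed, the nested-dictionary idea from the question might look roughly like the sketch below. The function name pivot_lines and the choice of an empty string for missing values are assumptions made for illustration, not part of the original answer.

```python
from collections import defaultdict


def pivot_lines(lines):
    """Pivot 'id sample value' lines into (headers, rows)."""
    table = defaultdict(dict)  # table[row_id][sample] = value
    headers = []               # sample names in first-seen order
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue           # skip blank or malformed lines
        row_id, sample, value = parts
        table[row_id][sample] = value
        if sample not in headers:
            headers.append(sample)
    # Dicts keep insertion order, so rows come out in first-seen order.
    rows = [[row_id] + [cells.get(h, "") for h in headers]
            for row_id, cells in table.items()]
    return headers, rows


lines = [
    "1182659 Sample05 22",
    "4758741 Sample05 74",
    "1182659 Sample33 14",  # same id, not adjacent to its first line
    "4758741 Sample33 2",
    "3652147 Sample33 34",  # Sample05 is missing for this id
]
headers, rows = pivot_lines(lines)
print("\t" + "\t".join(headers))
for row in rows:
    print("\t".join(row))
```

Because values are looked up by sample name rather than by position, a missing sample leaves an empty cell instead of shifting the remaining values into the wrong column.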
Answer 1 (score: 0)
Using pandas:
import pandas as pd
# if the delimiter is a space
df = pd.read_csv("<path to file>.txt", sep=" ", header=None)
df.set_index([0, 1])[2].unstack()
Output:
1        Sample05  Sample33
0
1182659        22        14
3652147         8        34
4758741        74         2
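The rows above come out sorted by id, whereas the question lists them in file order (1182659, 4758741, 3652147). If the original order matters, one possible variation is sketched below. The column names id, sample and value are made up for this example, and the data is read from an in-memory string only so that the snippet runs on its own.

```python
import io

import pandas as pd

text = """\
1182659 Sample05 22
1182659 Sample33 14
4758741 Sample05 74
4758741 Sample33 2
3652147 Sample05 8
3652147 Sample33 34
"""

# The column names are assumptions for this sketch; the file has no header.
# sep=r"\s+" accepts any run of whitespace (spaces or tabs) as the delimiter.
df = pd.read_csv(io.StringIO(text), sep=r"\s+", header=None,
                 names=["id", "sample", "value"])

# pivot reshapes long data to wide, like set_index(...)[...].unstack().
wide = df.pivot(index="id", columns="sample", values="value")

# Restore the order in which the ids first appear in the input.
wide = wide.reindex(df["id"].unique())
print(wide)
```

To write the result back out as tab-separated text, wide.to_csv("<path to output>.txt", sep="\t") should work, with the output path being a placeholder.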