I have the following DataFrame x:
   a  b  c  d  e  f  g  h
1     1
2        1     2
3  1     2        3     4
4  1     3     2     4
5  2     4     1     3
6     1        2     4  3
x.to_dict()
{'a': {1: '', 2: '', 3: '1', 4: '1', 5: '2', 6: ''},
'b': {1: '1', 2: '', 3: '', 4: '', 5: '', 6: '1'},
'c': {1: '', 2: '1', 3: '2', 4: '3', 5: '4', 6: ''},
'd': {1: '', 2: '', 3: '', 4: '', 5: '', 6: ''},
'e': {1: '', 2: '2', 3: '', 4: '2', 5: '1', 6: '2'},
'f': {1: '', 2: '', 3: '3', 4: '', 5: '', 6: ''},
'g': {1: '', 2: '', 3: '', 4: '4', 5: '3', 6: '4'},
'h': {1: '', 2: '', 3: '4', 4: '', 5: '', 6: '3'}}
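I want to produce this result (the cell values become the column names, the column names become the cell values, and the row labels are kept):
   1  2  3  4
1  b
2  c  e
3  a  c  f  h
4  a  e  c  g
5  e  a  g  c
6  b  e  h  g
Note that no row in x can contain duplicate values.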
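The no-duplicates condition is what makes the inversion well defined: if a row held the same value in two columns, two column names would compete for one output cell. A minimal sketch of how that precondition could be checked on the `x.to_dict()` form, using plain Python; `rows_with_duplicates` and `sample` are hypothetical names introduced here for illustration only:

```python
# Hypothetical helper (not part of pandas): reports the row labels in which
# some non-empty value occurs in more than one column.
def rows_with_duplicates(table):
    """table maps column name -> {row label -> value}, like x.to_dict()."""
    seen = {}    # row label -> set of non-empty values seen so far
    bad = set()  # row labels with at least one repeated value
    for column, cells in table.items():
        for row, value in cells.items():
            if value == '':
                continue  # empty cells are ignored
            values = seen.setdefault(row, set())
            if value in values:
                bad.add(row)
            values.add(value)
    return sorted(bad)

# Tiny made-up sample: row 2 holds '1' in both columns, row 1 does not repeat.
sample = {'a': {1: '1', 2: '1'}, 'b': {1: '2', 2: '1'}}
dup_rows = rows_with_duplicates(sample)
print(dup_rows)
```

An empty list means every row can be inverted without ambiguity.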
Answer 0 (score: 0)
Another approach, reshaping to long format with melt and then pivoting with pd.crosstab:
import pandas as pd

d = {'a': {1: '', 2: '', 3: '1', 4: '1', 5: '2', 6: ''},
     'b': {1: '1', 2: '', 3: '', 4: '', 5: '', 6: '1'},
     'c': {1: '', 2: '1', 3: '2', 4: '3', 5: '4', 6: ''},
     'd': {1: '', 2: '', 3: '', 4: '', 5: '', 6: ''},
     'e': {1: '', 2: '2', 3: '', 4: '2', 5: '1', 6: '2'},
     'f': {1: '', 2: '', 3: '3', 4: '', 5: '', 6: ''},
     'g': {1: '', 2: '', 3: '', 4: '4', 5: '3', 6: '4'},
     'h': {1: '', 2: '', 3: '4', 4: '', 5: '', 6: '3'}}
df = pd.DataFrame(d)
# Long format: one row per (index, variable, value) triple; drop the empty cells.
df = df.reset_index().melt(id_vars = 'index').query('value!=""')
# Pivot back: rows are the original index, columns are the old cell values,
# and each cell holds the old column name (unique per cell, so 'max' just picks it).
pd.crosstab(df['index'], df.value, df.variable, aggfunc = 'max')
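To make the reshaping logic easier to follow, here is a rough plain-Python sketch of the same inversion, working directly on the dictionary form of the data. It is not the pandas approach above, only an illustration of what the result should contain; `invert_rows` is a hypothetical helper name, and the sketch assumes, as the question states, that no row repeats a value.

```python
# Hypothetical helper: for every non-empty cell, swap the roles of
# column name and cell value while keeping the row label.
def invert_rows(table):
    """table maps column name -> {row label -> value}, like x.to_dict()."""
    result = {}  # row label -> {old cell value -> old column name}
    for column, cells in table.items():
        for row, value in cells.items():
            if value != '':
                result.setdefault(row, {})[value] = column
    return result

d = {'a': {1: '', 2: '', 3: '1', 4: '1', 5: '2', 6: ''},
     'b': {1: '1', 2: '', 3: '', 4: '', 5: '', 6: '1'},
     'c': {1: '', 2: '1', 3: '2', 4: '3', 5: '4', 6: ''},
     'd': {1: '', 2: '', 3: '', 4: '', 5: '', 6: ''},
     'e': {1: '', 2: '2', 3: '', 4: '2', 5: '1', 6: '2'},
     'f': {1: '', 2: '', 3: '3', 4: '', 5: '', 6: ''},
     'g': {1: '', 2: '', 3: '', 4: '4', 5: '3', 6: '4'},
     'h': {1: '', 2: '', 3: '4', 4: '', 5: '', 6: '3'}}

inverted = invert_rows(d)
# Print one line per row, with '' where a value does not occur in that row.
for row in sorted(inverted):
    print(row, [inverted[row].get(v, '') for v in '1234'])
```

A row whose cells are all empty would not appear in `inverted` at all; the sample data has no such row.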