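Create grouped columns based on ID and category

Time: 2020-05-31 22:17:30

Tags: python

I have the table below, and for each ID I need to create a "Relevant" column and a "Non-Relevant" column that hold the summed Length.

The table looks like this: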


+----+--------------+--------+
| ID |  Experience  | Length |
+----+--------------+--------+
|  1 | Relevant     |      2 |
|  1 | Non-Relevant |      1 |
|  4 | Relevant     |      3 |
|  4 | Relevant     |      4 |
|  4 | Non-Relevant |      0 |
|  5 | Relevant     |      1 |
|  5 | Relevant     |      1 |
+----+--------------+--------+


This is the output I want:

+----+----------+--------------+
| ID | Relevant | Non-Relevant |
+----+----------+--------------+
|  1 |        2 |            1 |
|  4 |        7 |            0 |
|  5 |        2 |            0 |
+----+----------+--------------+

2 answers:

Answer 0: (score: 1)

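import pandas as pd

# Shortened sample data: 'r' stands for Relevant, 'n' for Non-Relevant.
df = pd.DataFrame({'id': [1, 1, 4, 4, 4, 5, 5], 'exp': list('rnrrnrr'), 'len': [2, 1, 3, 4, 0, 1, 1]})

# One row per id, one column per exp value, summing len; missing combinations become 0.
pd.pivot_table(df, index='id', values='len', columns='exp', aggfunc='sum', fill_value=0)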

Documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.pivot_table.html
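
For reference, here is a minimal sketch of the same pivot_table call applied to the column names from the question (ID, Experience, Length) rather than the shortened ones above. The variable name `result`, the column reordering, and the index flattening are assumptions added for illustration, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    'ID': [1, 1, 4, 4, 4, 5, 5],
    'Experience': ['Relevant', 'Non-Relevant', 'Relevant', 'Relevant',
                   'Non-Relevant', 'Relevant', 'Relevant'],
    'Length': [2, 1, 3, 4, 0, 1, 1],
})

# Sum Length for every (ID, Experience) pair; fill missing pairs with 0.
result = pd.pivot_table(df, index='ID', columns='Experience', values='Length',
                        aggfunc='sum', fill_value=0)

# Put the columns in the order shown in the question and turn ID back
# into an ordinary column.
result = result[['Relevant', 'Non-Relevant']].reset_index()
result.columns.name = None
print(result)
```

ID 5 has no Non-Relevant rows, so that cell only becomes 0 because of fill_value=0; without it the cell would be NaN.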

Answer 1: (score: 1)

To create the DataFrame:

import pandas as pd

ID = [1,1,4,4,4,5,5]
Experience = ['Relevant', 'Non-Relevant', 'Relevant', 'Relevant', 'Non-Relevant', 
'Relevant', 'Relevant']
length = [2,1,3,4,0,1,1]

dictionary = {'ID' : ID,
             'Experience' : Experience,
             'Length' : length}

# Build the DataFrame that the groupby below operates on.
df = pd.DataFrame(dictionary)

Group it, then unstack:

df.groupby(by=['ID','Experience']).sum().unstack()['Length'].fillna(0)
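
A slightly tighter variant of the same idea, offered as a sketch rather than a correction: selecting the Length column before summing gives a Series, and unstack accepts fill_value directly, so the separate ['Length'] lookup and fillna(0) are not needed. The name `result` is an assumption for illustration.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    'ID': [1, 1, 4, 4, 4, 5, 5],
    'Experience': ['Relevant', 'Non-Relevant', 'Relevant', 'Relevant',
                   'Non-Relevant', 'Relevant', 'Relevant'],
    'Length': [2, 1, 3, 4, 0, 1, 1],
})

# Sum Length per (ID, Experience), then move Experience from the row index
# into columns; pairs that never occur are filled with 0.
result = df.groupby(['ID', 'Experience'])['Length'].sum().unstack(fill_value=0)
print(result)
```

The resulting columns come out in sorted order (Non-Relevant, then Relevant), so reorder them with result[['Relevant', 'Non-Relevant']] if the layout from the question matters.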