pandas groupby: collect a column into a list and keep only certain values

Asked: 2021-02-10 18:37:40

Tags: python python-3.x pandas group-by pandas-groupby

I have the following DataFrame:

id       occupations
111      teacher
111      student
222      analyst
333      cook
111      driver
444      lawyer

I created a new column that holds the list of all occupations for each id:

new_df['occupation_list'] = df['id'].map(df.groupby('id')['occupations'].agg(list))

How can I include only the teacher and student values in occupation_list?

3 answers:

Answer 0 (score: 1)

You can filter the rows before the groupby, so that only the wanted occupations are aggregated:

to_map = (df[df['occupations'].isin(['teacher', 'student'])]
             .groupby('id')['occupations'].agg(list)
         )

df['occupation_list'] = df['id'].map(to_map)

Output:

    id occupations     occupation_list
0  111     teacher  [teacher, student]
1  111     student  [teacher, student]
2  222     analyst                 NaN
3  333        cook                 NaN
4  111      driver  [teacher, student]
5  444      lawyer                 NaN
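
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the approach above. The DataFrame is rebuilt from the question's sample data, and the name `keep` is only an illustrative choice, not something from the original post.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "id": [111, 111, 222, 333, 111, 444],
    "occupations": ["teacher", "student", "analyst", "cook", "driver", "lawyer"],
})

# Hypothetical name: the occupations we want to keep in the list.
keep = ["teacher", "student"]

# Filter first, then build one list per id from the remaining rows.
to_map = (
    df[df["occupations"].isin(keep)]
    .groupby("id")["occupations"]
    .agg(list)
)

# Map the per-id lists back onto every row; ids with no match become NaN.
df["occupation_list"] = df["id"].map(to_map)
print(df)
```

Note that ids without any teacher or student row are absent from `to_map`, which is why `map` leaves NaN for them instead of an empty list.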

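Answer 1 (score: 0)

You can also do the following. Note that it returns every occupation for each id rather than only teacher and student, and it relies on occupation names not containing spaces: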

df.groupby('id')['occupations'].transform(' '.join).str.split()
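
A possible way to adapt this idea to the actual question is sketched below. It is only an illustration: the `"|"` separator is an assumption (it must not occur in any occupation name), and `keep` and `all_lists` are hypothetical names. Unlike the filter-first approach, ids with no match get an empty list rather than NaN.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "id": [111, 111, 222, 333, 111, 444],
    "occupations": ["teacher", "student", "analyst", "cook", "driver", "lawyer"],
})

# Hypothetical name: the occupations we want to keep.
keep = {"teacher", "student"}

# Join each group's values with "|" (assumed absent from the data),
# broadcast the joined string to every row, then split it back into a list.
all_lists = df.groupby("id")["occupations"].transform("|".join).str.split("|")

# Drop everything that is not in `keep` from each row's list.
df["occupation_list"] = all_lists.apply(lambda xs: [x for x in xs if x in keep])
print(df)
```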

Answer 2 (score: 0)

You can simply group by id and aggregate the column into a list:

df.groupby('id',as_index=False).agg({'occupations':lambda x: x.tolist()})

Output:

>>> df
    id occupations
0  111     teacher
1  111     student
2  222     analyst
3  333        cook
4  111      driver
5  444      lawyer
>>> df.groupby('id',as_index=False).agg({'occupations':lambda x: x.tolist()})
    id                 occupations
0  111  [teacher, student, driver]
1  222                   [analyst]
2  333                      [cook]
3  444                    [lawyer]
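
The aggregation above keeps every occupation and returns one row per id. If only teacher and student should survive, one option is to filter inside the aggregating function, as in this rough sketch. The names `keep` and `per_id` are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "id": [111, 111, 222, 333, 111, 444],
    "occupations": ["teacher", "student", "analyst", "cook", "driver", "lawyer"],
})

# Hypothetical name: the occupations we want to keep.
keep = {"teacher", "student"}

# One row per id; each list holds only the occupations found in `keep`.
per_id = df.groupby("id", as_index=False).agg(
    {"occupations": lambda x: [v for v in x if v in keep]}
)
print(per_id)
```

Here ids with no matching occupation end up with an empty list, and the result has one row per id instead of one row per original record.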