Pandas DataFrame: how to add a column with the number of occurrences of values in another column

Asked: 2016-05-06 17:06:53

Tags: python pandas pandas-groupby

I have the following df:

Col1    Col2
test    Something
test2   Something
test3   Something
test    Something
test2   Something
test5   Something

I would like to get:

Col1    Col2          Occur
test    Something     2
test2   Something     2
test3   Something     1
test    Something     2
test2   Something     2
test5   Something     1

I tried:

df["Occur"] = df["Col1"].value_counts()

But it did not help. My Occur column is filled with NaN.

3 answers:

Answer 0 (score: 2)

Use groupby on 'Col1', then apply transform to Col2. This returns a Series whose index is aligned with the original df, so you can add it as a column:

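In [3]:
# value_counts yields one count per distinct Col2 value, so this relies on
# each Col1 group containing a single distinct Col2 value
df['Occur'] = df.groupby('Col1')['Col2'].transform(pd.Series.value_counts)
df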

Out[3]:
    Col1       Col2 Occur
0   test  Something     2
1  test2  Something     2
2  test3  Something     1
3   test  Something     2
4  test2  Something     2
5  test5  Something     1
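
The NaN values in the question most likely come from index alignment: value_counts() returns a Series indexed by the distinct Col1 values, while df is indexed 0..5, so no labels match on assignment. Below is a minimal sketch, assuming the example frame from the question is rebuilt by hand with a default integer index. The map-based lookup is an alternative that is not part of the original answer, and the Occur_nan column name is made up for illustration.

```python
import pandas as pd

# Example frame rebuilt from the question (assumed default index 0..5).
df = pd.DataFrame({
    "Col1": ["test", "test2", "test3", "test", "test2", "test5"],
    "Col2": ["Something"] * 6,
})

# Counts are indexed by the distinct Col1 values ('test', 'test2', ...).
counts = df["Col1"].value_counts()

# Direct assignment aligns on index labels; none of them match 0..5,
# so every row ends up as NaN. (Occur_nan is a hypothetical column name.)
df["Occur_nan"] = counts

# Looking each Col1 value up in counts yields one count per row.
df["Occur"] = df["Col1"].map(counts)

print(df)
```

The map call uses the Col1 values as lookup keys into counts, which is why it sidesteps the alignment problem.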

Answer 1 (score: 1)

You can also use GroupBy.transform with size:

df['Occur'] = df.groupby('Col1')['Col1'].transform('size')

print(df)

    Col1       Col2  Occur
0   test  Something      2
1  test2  Something      2
2  test3  Something      1
3   test  Something      2
4  test2  Something      2
5  test5  Something      1

Answer 2 (score: -1)

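I could not get the other answers to work when I wanted to keep more columns than just Col1 and Col2. The following works well for me and keeps any number of other columns.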

df['Occur'] = df['Col1'].apply(lambda x: (df['Col1'] == x).sum())
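
For comparison, here is a sketch suggesting that the groupby/transform approach should also keep additional columns, since it only adds one column to the existing frame. Col3 is a hypothetical extra column that does not appear in the question. Note that the row-wise apply rescans the whole column once per row, so it is likely to be slower on large frames.

```python
import pandas as pd

# Example frame from the question plus a hypothetical extra column Col3.
df = pd.DataFrame({
    "Col1": ["test", "test2", "test3", "test", "test2", "test5"],
    "Col2": ["Something"] * 6,
    "Col3": range(6),  # assumption: any additional column
})

# Row-wise approach from this answer: compare every row against the column.
occur_apply = df["Col1"].apply(lambda x: (df["Col1"] == x).sum())

# groupby/transform approach: group sizes broadcast back to each row.
df["Occur"] = df.groupby("Col1")["Col1"].transform("size")

print(df)
# Check whether both approaches agree on this example.
print((occur_apply == df["Occur"]).all())
```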