pandas value_counts() and keeping the result

Time: 2020-06-12 03:38:32

Tags: pandas

I want to count values with .value_counts() and keep the result in the DataFrame. For example:

import pandas as pd

df = pd.DataFrame({ 'fruit':['apples']*3  + ['peaches']*5 + ['bananas']*3 +
                            ['carrots']*4 + ['apricots']*10 })

print(df.head())

    fruit
0   apples
1   apples
2   apples
3   peaches
4   peaches

df["fruit"].value_counts()

apricots    10
peaches      5
carrots      4
apples       3
bananas      3
Name: fruit, dtype: int64

And I want the rows of df arranged in that same order (most frequent value first):

print(df)

    fruit
0   apricots
1   apricots
2   apricots
3   apricots
4   apricots

How can I do this?

Please share any ideas. Thanks!

3 answers:

Answer 0 (score: 2)

I would do it in one line with groupby and transform.

df['count'] = df.groupby('fruit')['fruit'].transform('count')
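
The line above only adds the counts; it does not reorder the rows. A minimal sketch of one way to finish the job, assuming you want the most frequent fruit first and ties broken alphabetically (the name `ordered` is just for illustration):

```python
import pandas as pd

df = pd.DataFrame({"fruit": ["apples"] * 3 + ["peaches"] * 5 + ["bananas"] * 3
                            + ["carrots"] * 4 + ["apricots"] * 10})

# Per-row count of each fruit, aligned with the original index.
df["count"] = df.groupby("fruit")["fruit"].transform("count")

# Sort by count (descending), then by fruit name so that fruits with
# equal counts (apples and bananas here) stay grouped together.
ordered = (
    df.sort_values(["count", "fruit"], ascending=[False, True])
      .reset_index(drop=True)
)

print(ordered.head())
```

Drop the helper column afterwards with `ordered.drop(columns="count")` if you only need the reordered fruit column.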

Answer 1 (score: 0)

Step 1: Build the value counts table

vc = df['fruit'].value_counts().reset_index()
      index  fruit
0  apricots     10
1   peaches      5
2   carrots      4
3    apples      3
4   bananas      3

Step 2: Merge

f = df.merge(vc, how='left', left_on='fruit', right_on='index')
     fruit_x     index  fruit_y
0     apples    apples        3
1     apples    apples        3
2     apples    apples        3
3    peaches   peaches        5
4    peaches   peaches        5
5    peaches   peaches        5
6    peaches   peaches        5
7    peaches   peaches        5
8    bananas   bananas        3
9    bananas   bananas        3
10   bananas   bananas        3
11   carrots   carrots        4
12   carrots   carrots        4
13   carrots   carrots        4
14   carrots   carrots        4
15  apricots  apricots       10
16  apricots  apricots       10
17  apricots  apricots       10
18  apricots  apricots       10
19  apricots  apricots       10
20  apricots  apricots       10
21  apricots  apricots       10
22  apricots  apricots       10
23  apricots  apricots       10
24  apricots  apricots       10

Step 3: Do some cleanup

f = f.drop('index', axis=1).rename({'fruit_x': 'fruit', 'fruit_y': 'count'}, axis=1)
       fruit  count
0     apples      3
1     apples      3
2     apples      3
3    peaches      5
4    peaches      5
5    peaches      5
6    peaches      5
7    peaches      5
8    bananas      3
9    bananas      3
10   bananas      3
11   carrots      4
12   carrots      4
13   carrots      4
14   carrots      4
15  apricots     10
16  apricots     10
17  apricots     10
18  apricots     10
19  apricots     10
20  apricots     10
21  apricots     10
22  apricots     10
23  apricots     10
24  apricots     10
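
Note that the column names produced by value_counts().reset_index() depend on the pandas version: the 'index' / 'fruit' names shown above are what older releases give, while pandas 2.0 and later name them 'fruit' / 'count' by default, as far as I know. A sketch of the same merge that names both columns explicitly, so it should not depend on those defaults (`merged` is an illustrative name):

```python
import pandas as pd

df = pd.DataFrame({"fruit": ["apples"] * 3 + ["peaches"] * 5 + ["bananas"] * 3
                            + ["carrots"] * 4 + ["apricots"] * 10})

# Name the index and the value column explicitly instead of relying
# on version-specific defaults.
vc = df["fruit"].value_counts().rename_axis("fruit").reset_index(name="count")

# A left merge keeps every row of df, in df's original order.
merged = df.merge(vc, how="left", on="fruit")

print(merged.head())
```

With both columns named up front, the cleanup step (drop and rename) is no longer needed.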

Answer 2 (score: 0)

IIUC, use pandas.Series.value_counts together with pd.Index.repeat:

s = df["fruit"].value_counts()
df["fruit"] = s.index.repeat(s)
print(df)

Output:

       fruit
0   apricots
1   apricots
2   apricots
3   apricots
...
21   bananas
22    apples
23    apples
24    apples
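
One caveat: assigning the repeated index overwrites the fruit column in place, so it only fits a single-column frame; any other columns would no longer line up with their fruit. A rough sketch of a row-preserving alternative, assuming the frame has extra columns (the `row_id` column and the `_rank` helper are hypothetical, added only for illustration):

```python
import pandas as pd

df = pd.DataFrame({"fruit": ["apples"] * 3 + ["peaches"] * 5 + ["bananas"] * 3
                            + ["carrots"] * 4 + ["apricots"] * 10})
df["row_id"] = range(len(df))  # hypothetical extra column, for illustration only

counts = df["fruit"].value_counts()

# Position of each fruit in the value_counts order (0 = most frequent).
rank = {fruit: i for i, fruit in enumerate(counts.index)}

# Sort whole rows by that position; mergesort is stable, so rows of the
# same fruit keep their original relative order.
sorted_df = (
    df.assign(_rank=df["fruit"].map(rank))
      .sort_values("_rank", kind="mergesort")
      .drop(columns="_rank")
      .reset_index(drop=True)
)

print(sorted_df.head())
```

The fruit column ends up in the same order as with Index.repeat, but each row keeps its other values.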