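Aggregate (calculate) a specific category out of multiple categories, grouped by unique ID

Time: 2018-07-25 19:53:51

Tags: pandas python-3.6 calculation

As a follow-up to my previous question, Best way(run-time) to aggregate (calculate ratio of) sum to total count based on group by (thanks @jezrael)

I have another column with 4 different statuses, e.g. 1, 2, 3, 4.

I am now trying to find the ratio of 1s for each ID.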

Desired output:

Find the ratio of 1s for each ID

2 Answers:

Answer 0: (score: 3)

You can use

 df.groupby('Cust_ID')['STATUS'].apply(lambda x: (x == 1).mean())

Output:

Cust_ID
a    0.666667
b    0.333333
c    0.000000
d    1.000000
Name: STATUS, dtype: float64
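
To make the one-liner above easy to try, here is a minimal self-contained sketch. The sample DataFrame is hypothetical (the question's original data is not shown here); it was chosen only so that the per-ID ratios match the output above. The column names `Cust_ID` and `STATUS` come from the answer.

```python
import pandas as pd

# Hypothetical sample data (an assumption, not the asker's data),
# picked so the ratios reproduce the output shown above.
df = pd.DataFrame({
    "Cust_ID": ["a", "a", "a", "b", "b", "b", "c", "c", "d", "d"],
    "STATUS":  [1, 1, 2, 1, 3, 4, 2, 3, 1, 1],
})

# (x == 1) is a boolean Series for each group; True counts as 1 and
# False as 0, so its mean is the fraction of rows whose STATUS is 1.
ratio = df.groupby("Cust_ID")["STATUS"].apply(lambda x: (x == 1).mean())
print(ratio)
```

The same pattern works for any other status by changing the value compared against.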

Answer 1: (score: 3)

Use groupby with mean on a boolean mask created by eq (==), for a one-column DataFrame:

df1 = df['STATUS'].eq(1).groupby(df['Cust_ID']).mean().to_frame()
#alternative
#df1 = (df['STATUS'] == 1).groupby(df['Cust_ID']).mean().to_frame()
print (df1)
           STATUS
Cust_ID          
a        0.666667
b        0.333333
c        0.000000
d        1.000000

For a two-column DataFrame:

df1 = df['STATUS'].eq(1).groupby(df['Cust_ID']).mean().reset_index()
#alternative
#df1 = (df['STATUS'] == 1).groupby(df['Cust_ID']).mean().reset_index()
print (df1)
  Cust_ID    STATUS
0       a  0.666667
1       b  0.333333
2       c  0.000000
3       d  1.000000
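
As a quick sanity check, the sketch below compares the two approaches from the answers on a small hypothetical DataFrame (the data is an assumption made for illustration, not the asker's). Building the mask once on the whole column avoids calling a Python lambda for every group, which would typically be the faster option on large data, though that is not measured here.

```python
import pandas as pd

# Hypothetical sample data, assumed for illustration only.
df = pd.DataFrame({
    "Cust_ID": ["a", "a", "a", "b", "b", "b", "c", "c", "d", "d"],
    "STATUS":  [1, 1, 2, 1, 3, 4, 2, 3, 1, 1],
})

# Approach from answer 0: compare inside each group via apply.
by_apply = df.groupby("Cust_ID")["STATUS"].apply(lambda x: (x == 1).mean())

# Approach from answer 1: build the boolean mask once, then group it.
by_mask = df["STATUS"].eq(1).groupby(df["Cust_ID"]).mean()

# Largest absolute difference between the two results, per Cust_ID.
max_diff = (by_apply - by_mask).abs().max()
print(max_diff)
```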