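What I want to do is group the teams in column A and get the total number of times each value appears. For example, Team1 appears four times. Then I want to divide the number of times Yes appears in column B for that team by that count (four for Team1) and get a percentage.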
Current
A B C
Team1 Yes 4
Team2 Yes 1
Team1 No 4
Team3 Yes 2
Team1 No 4
Team6 *blank* 1
Team3 No 2
Team1 *blank* 4
Desired
Team1 25%
Team2 100%
Team3 50%
Team6 0%
This is what I have done so far, but it does not get me to the desired result.
import csv
import pandas as pd
import numpy as np
# Select columns from csv file
csv_columns = ['Team', 'Status']
pd.set_option('display.max_rows', 900)
df = pd.read_csv('test.csv', skipinitialspace=True, usecols=csv_columns)
df['Count'] = df.groupby('Team')['Team'].transform('count')
print(df)
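Answer 0: (score: 1)
Use groupby on a boolean mask. The mean of the True/False values within each group is the fraction of Yes rows, and blanks count as not Yes: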
df.B.eq('Yes').groupby(df.A).mean()
A
Team1 0.25
Team2 1.00
Team3 0.50
Team6 0.00
Name: B, dtype: float64
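To get the percentage strings shown under Desired, the same idea can be applied to the Team and Status columns from the question. The following is a minimal sketch, assuming column A corresponds to Team and column B to Status; the inline DataFrame stands in for test.csv, and the names yes_share and percent are only illustrative.

```python
import pandas as pd

# Sample data copied from the question; stands in for test.csv (assumption).
df = pd.DataFrame({
    "Team": ["Team1", "Team2", "Team1", "Team3",
             "Team1", "Team6", "Team3", "Team1"],
    "Status": ["Yes", "Yes", "No", "Yes", "No", None, "No", None],
})

# True where Status is exactly "Yes"; blanks (missing values) become False.
is_yes = df["Status"].eq("Yes")

# The mean of a boolean mask per team is the fraction of "Yes" rows.
yes_share = is_yes.groupby(df["Team"]).mean()

# Convert the fractions to whole-number percentage strings.
percent = (yes_share * 100).round().astype(int).astype(str) + "%"
print(percent)
```

Because the mask is built before grouping, no separate Count column is needed: the mean already divides the number of Yes rows by the number of rows per team.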