I have an IPL dataset. I want to group the data by the number of times each team won the toss and the number of times they went on to win the match after winning the toss.
The first ten rows of the data look like this:
df.head(10):
                   toss_winner                       winner
0  Royal Challengers Bangalore          Sunrisers Hyderabad
1       Rising Pune Supergiant       Rising Pune Supergiant
2        Kolkata Knight Riders        Kolkata Knight Riders
3              Kings XI Punjab              Kings XI Punjab
4  Royal Challengers Bangalore  Royal Challengers Bangalore
5          Sunrisers Hyderabad          Sunrisers Hyderabad
6               Mumbai Indians               Mumbai Indians
7  Royal Challengers Bangalore              Kings XI Punjab
8       Rising Pune Supergiant             Delhi Daredevils
9               Mumbai Indians               Mumbai Indians
I tried several variants of groupby and aggregation, but nothing seems to work.
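For anyone who wants to reproduce this, the ten rows shown in the question can be rebuilt as a small DataFrame. This is only a sketch: it assumes the two columns shown are the only ones that matter here, and the variable name `rows` is made up for illustration.

```python
import pandas as pd

# The ten (toss_winner, winner) pairs copied from the df.head(10) output.
rows = [
    ("Royal Challengers Bangalore", "Sunrisers Hyderabad"),
    ("Rising Pune Supergiant", "Rising Pune Supergiant"),
    ("Kolkata Knight Riders", "Kolkata Knight Riders"),
    ("Kings XI Punjab", "Kings XI Punjab"),
    ("Royal Challengers Bangalore", "Royal Challengers Bangalore"),
    ("Sunrisers Hyderabad", "Sunrisers Hyderabad"),
    ("Mumbai Indians", "Mumbai Indians"),
    ("Royal Challengers Bangalore", "Kings XI Punjab"),
    ("Rising Pune Supergiant", "Delhi Daredevils"),
    ("Mumbai Indians", "Mumbai Indians"),
]
df = pd.DataFrame(rows, columns=["toss_winner", "winner"])
print(df.head(10))
```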
Answer 0: (score: 0)
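Try melt first, then groupby and unstack. Note that this counts how often each team appears in each column, so the winner column is the total number of match wins, not only the wins that followed a toss win.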
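import pandas as pd

# Melt both columns into long form, count each team per original column,
# then unstack so that toss_winner and winner become columns again.
s = (pd.melt(df).groupby('value')['variable'].value_counts()
       .unstack('variable').fillna(0))
print(s)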
variable                     toss_winner  winner
value
Delhi Daredevils                     0.0     1.0
Kings XI Punjab                      1.0     2.0
Kolkata Knight Riders                1.0     1.0
Mumbai Indians                       2.0     2.0
Rising Pune Supergiant               2.0     1.0
Royal Challengers Bangalore          3.0     1.0
Sunrisers Hyderabad                  1.0     2.0
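The one-line chain in this answer can be easier to follow when split into its intermediate steps. The sketch below does that on the ten sample rows from the question. The names `long_df` and `counts` are made up for illustration, and `fill_value=0` is swapped in for `fillna(0)` so that the counts stay integers.

```python
import pandas as pd

# Ten sample rows from the question.
df = pd.DataFrame(
    [
        ("Royal Challengers Bangalore", "Sunrisers Hyderabad"),
        ("Rising Pune Supergiant", "Rising Pune Supergiant"),
        ("Kolkata Knight Riders", "Kolkata Knight Riders"),
        ("Kings XI Punjab", "Kings XI Punjab"),
        ("Royal Challengers Bangalore", "Royal Challengers Bangalore"),
        ("Sunrisers Hyderabad", "Sunrisers Hyderabad"),
        ("Mumbai Indians", "Mumbai Indians"),
        ("Royal Challengers Bangalore", "Kings XI Punjab"),
        ("Rising Pune Supergiant", "Delhi Daredevils"),
        ("Mumbai Indians", "Mumbai Indians"),
    ],
    columns=["toss_winner", "winner"],
)

# Step 1: melt stacks both columns into long form. 'variable' holds the
# original column name and 'value' holds the team name.
long_df = pd.melt(df)

# Step 2: for each team, count how often it appears under each column.
counts = long_df.groupby("value")["variable"].value_counts()

# Step 3: turn the column names back into columns; a team that never
# appears in one of the columns gets 0 instead of NaN.
s = counts.unstack("variable", fill_value=0)
print(s)
```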
Answer 1: (score: 0)
Here is a simple approach that makes each step easy to follow:
# number of times each team won the toss
a = df.groupby("toss_winner").size()
# number of times each team won the match after winning the toss
b = df.query("toss_winner == winner").groupby("toss_winner").size()
# combine both counts side by side and give the columns readable names
f = pd.concat([a, b], axis=1).reset_index().rename(columns={0: 'total_toss_win', 1: 'win_on_toss_win'})
print(f)
                   toss_winner  total_toss_win  win_on_toss_win
0              Kings XI Punjab               1                1
1        Kolkata Knight Riders               1                1
2               Mumbai Indians               2                2
3       Rising Pune Supergiant               2                1
4  Royal Challengers Bangalore               3                1
5          Sunrisers Hyderabad               1                1
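One caveat with the concat approach: a team that won the toss but never went on to win the match would be missing from b and would show up as NaN. That does not happen in the ten sample rows, but it could in the full data. A possible variant, sketched here with a hypothetical helper column `won_after_toss`, gets both counts from a single groupby using named aggregation, so such a team would get 0 instead.

```python
import pandas as pd

# Ten sample rows from the question.
df = pd.DataFrame(
    [
        ("Royal Challengers Bangalore", "Sunrisers Hyderabad"),
        ("Rising Pune Supergiant", "Rising Pune Supergiant"),
        ("Kolkata Knight Riders", "Kolkata Knight Riders"),
        ("Kings XI Punjab", "Kings XI Punjab"),
        ("Royal Challengers Bangalore", "Royal Challengers Bangalore"),
        ("Sunrisers Hyderabad", "Sunrisers Hyderabad"),
        ("Mumbai Indians", "Mumbai Indians"),
        ("Royal Challengers Bangalore", "Kings XI Punjab"),
        ("Rising Pune Supergiant", "Delhi Daredevils"),
        ("Mumbai Indians", "Mumbai Indians"),
    ],
    columns=["toss_winner", "winner"],
)

# 1 where the toss winner also won the match, 0 otherwise.
flagged = df.assign(
    won_after_toss=(df["toss_winner"] == df["winner"]).astype(int)
)

# size counts all tosses won; sum counts only those followed by a match win.
f = (
    flagged.groupby("toss_winner")
    .agg(
        total_toss_win=("won_after_toss", "size"),
        win_on_toss_win=("won_after_toss", "sum"),
    )
    .reset_index()
)
print(f)
```

On the sample rows this yields the same counts as the table in this answer.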