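I'm running into a problem that is hard to explain. I have a DataFrame that has been grouped into chunks of 4 rows. Every row has a column named "Value".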
Name Role Cost Value
0 Johnny Tsunami Driver 1000 39
1 Michael B. Jackson Pistol 2500 46
2 Bobby Zuko Pistol 3000 50
3 Greg Ritcher Lookout 200 25
4 Johnny Tsunami Driver 1000 39
5 Michael B. Jackson Pistol 2500 46
6 Bobby Zuko Pistol 3000 50
7 Appa Derren Lookout 250 30
8 Baby Hitsuo Driver 950 35
9 Michael B. Jackson Pistol 2500 46
10 Bobby Zuko Pistol 3000 50
11 Appa Derren Lookout 250 30
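Basically, I want to sort these groups in descending order by the sum of Value within each group.
It seems like it should be simple. I have tried many things and hit all sorts of errors: sum() not being an attribute, str problems, DataFrame object problems. I have tried sort, sum, lambda, and agg functions. I can't believe I'm having trouble sorting a groupby in descending order. Here is a snippet and a visual.
The groupby essentially does this to the DataFrame above: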
0
Name Role Cost Value
0 Johnny Tsunami Driver 1000 39
1 Michael B. Jackson Pistol 2500 46
2 Bobby Zuko Pistol 3000 50
3 Greg Ritcher Lookout 200 25
Cost: 6700 Value: 160
1
Name Role Cost Value
4 Johnny Tsunami Driver 1000 39
5 Michael B. Jackson Pistol 2500 46
6 Bobby Zuko Pistol 3000 50
7 Appa Derren Lookout 250 30
Cost: 6750 Value: 165
2
Name Role Cost Value
8 Baby Hitsuo Driver 950 35
9 Michael B. Jackson Pistol 2500 46
10 Bobby Zuko Pistol 3000 50
11 Appa Derren Lookout 250 30
Cost: 6700 Value: 161
After sorting, I'd like to print the DataFrame with this as the final result:
4 Johnny Tsunami Driver 1000 39
5 Michael B. Jackson Pistol 2500 46
6 Bobby Zuko Pistol 3000 50
7 Appa Derren Lookout 250 30
8 Baby Hitsuo Driver 950 35
9 Michael B. Jackson Pistol 2500 46
10 Bobby Zuko Pistol 3000 50
11 Appa Derren Lookout 250 30
0 Johnny Tsunami Driver 1000 39
1 Michael B. Jackson Pistol 2500 46
2 Bobby Zuko Pistol 3000 50
3 Greg Ritcher Lookout 200 25
Here are the DataFrame and the code:
from pprint import pprint
import pandas as pd
import numpy as np
data= [['Johnny Tsunami','Driver',1000,39],
['Michael B. Jackson','Pistol',2500,46],
['Bobby Zuko','Pistol',3000,50],
['Greg Ritcher','Lookout',200,25],
['Johnny Tsunami','Driver',1000,39],
['Michael B. Jackson','Pistol',2500,46],
['Bobby Zuko','Pistol',3000,50],
['Appa Derren','Lookout',250,30],
['Baby Hitsuo','Driver',950,35],
['Michael B. Jackson','Pistol',2500,46],
['Bobby Zuko','Pistol',3000,50],
['Appa Derren','Lookout',250,30]]
df = pd.DataFrame(data,columns=['Name','Role','Cost','Value'])
#groupby4s
gr = df.groupby(np.arange(len(df.index))/4)
Answer 0 (score: 2)
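Here is what I would do:
First build the groups of 4, sort them by their sums, and save the resulting index order (note that I changed the code that builds the groups to use integer division):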
gr = df.groupby(np.arange(len(df.index.values))//4)
grp_order = (gr.sum()).sort_values('Value', ascending=False).index
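The switch to integer division matters because, assuming Python 3, `/` produces a distinct float label for every row, so no rows end up sharing a group. A minimal sketch of the difference (the variable names `true_div` and `floor_div` are illustrative only, not from the original post):

```python
import numpy as np

n = 12  # number of rows in the example DataFrame

true_div = np.arange(n) / 4    # float labels: 0.0, 0.25, 0.5, ...
floor_div = np.arange(n) // 4  # integer labels shared by each run of 4 rows

# With true division every row gets its own label, so groupby makes 12 groups.
n_groups_true = len(np.unique(true_div))
# With floor division there are only 3 labels, one per chunk of 4 rows.
n_groups_floor = len(np.unique(floor_div))

print(n_groups_true, n_groups_floor)
print(floor_div.tolist())
```

With `//`, rows 0-3 get label 0, rows 4-7 get label 1, and rows 8-11 get label 2, which is the grouping the question describes.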
Then print the groups in that order:
for idx in grp_order:
    print(idx)
    print(gr.get_group(idx))
    # total of the Value column for this group
    print('Value:', gr.get_group(idx).Value.sum())
Output:
1
Name Role Cost Value
4 Johnny Tsunami Driver 1000 39
5 Michael B. Jackson Pistol 2500 46
6 Bobby Zuko Pistol 3000 50
7 Appa Derren Lookout 250 30
Value: 165
2
Name Role Cost Value
8 Baby Hitsuo Driver 950 35
9 Michael B. Jackson Pistol 2500 46
10 Bobby Zuko Pistol 3000 50
11 Appa Derren Lookout 250 30
Value: 161
0
Name Role Cost Value
0 Johnny Tsunami Driver 1000 39
1 Michael B. Jackson Pistol 2500 46
2 Bobby Zuko Pistol 3000 50
3 Greg Ritcher Lookout 200 25
Value: 160
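If a single sorted DataFrame is wanted rather than printed groups, one possible extension of this approach is to concatenate the groups in ranked order. This is only a sketch: it uses a cut-down copy of the question's data (just the Role and Value columns), and `sorted_df` is a name introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Cut-down version of the question's data: same Value column, fewer columns.
df = pd.DataFrame({
    'Role': ['Driver', 'Pistol', 'Pistol', 'Lookout'] * 3,
    'Value': [39, 46, 50, 25, 39, 46, 50, 30, 35, 46, 50, 30],
})

# Group every 4 consecutive rows, then rank the groups by their Value total.
gr = df.groupby(np.arange(len(df)) // 4)
grp_order = gr['Value'].sum().sort_values(ascending=False).index

# Stack the groups back together in ranked order.
sorted_df = pd.concat([gr.get_group(idx) for idx in grp_order])
print(sorted_df)
```

Summing only the Value column avoids aggregating the text columns, which are not needed to rank the groups.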
Answer 1 (score: 2)
Use transform to create an additional key, then sort with sort_values:
df['key'] = df['Value'].groupby(np.arange(len(df))//4).transform('sum')
df = df.sort_values('key', ascending=False)
df
Out[104]:
Name Role Cost Value key
4 Johnny Tsunami Driver 1000 39 165
5 Michael B. Jackson Pistol 2500 46 165
6 Bobby Zuko Pistol 3000 50 165
7 Appa Derren Lookout 250 30 165
8 Baby Hitsuo Driver 950 35 161
9 Michael B. Jackson Pistol 2500 46 161
10 Bobby Zuko Pistol 3000 50 161
11 Appa Derren Lookout 250 30 161
0 Johnny Tsunami Driver 1000 39 160
1 Michael B. Jackson Pistol 2500 46 160
2 Bobby Zuko Pistol 3000 50 160
3 Greg Ritcher Lookout 200 25 160
Note that I did not drop the key I created for sorting; you can drop that column afterwards if you don't need it.
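For completeness, here is one way the helper column could be removed once the sort is done. This is a sketch rather than the answerer's exact code: it uses a cut-down copy of the question's data, the name `result` is illustrative, and `kind='mergesort'` is an addition of mine so that rows with equal keys are guaranteed to keep their original relative order.

```python
import numpy as np
import pandas as pd

# Cut-down version of the question's data: same Value column, fewer columns.
df = pd.DataFrame({
    'Role': ['Driver', 'Pistol', 'Pistol', 'Lookout'] * 3,
    'Value': [39, 46, 50, 25, 39, 46, 50, 30, 35, 46, 50, 30],
})

# Broadcast each group's Value total onto its rows as a temporary sort key.
df['key'] = df['Value'].groupby(np.arange(len(df)) // 4).transform('sum')

result = (
    df.sort_values('key', ascending=False, kind='mergesort')  # stable sort
      .drop(columns='key')                                    # remove helper
)
print(result)
```

Because every row in a group carries the same key, a stable sort keeps each group's rows together and in their original order, even when two groups happen to have the same total.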