I have the following data:
df =
QUEUE_1  QUEUE_2  QUEUE_3  HOUR  TOTAL_SERVICE_TIME  TOTAL_WAIT_TIME
ABC123   DEF656            7     20                  30
ABC123                     7     22                  32
DEF656   ABC123   FED456   8     15                  12
FED456   DEF656            8     15                  16
I want to calculate the average TOTAL_SERVICE_TIME and TOTAL_WAIT_TIME for each type of QUEUE (ABC123, DEF656 and FED456).
The result should look like this:
result =
QUEUE   HOUR  AVG_TOT_SERVICE_TIME  AVG_TOT_WAIT_TIME
ABC123  7     21                    31
ABC123  8     15                    12
DEF656  7     20                    30
DEF656  8     15                    14
FED456  7     0                     0
FED456  8     15                    14
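This is my code so far, but it does not seem to give the expected result. In particular, the HOUR values are not sorted, and the averages of TOTAL_SERVICE_TIME and TOTAL_WAIT_TIME are not calculated correctly.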
cols = ['QUEUE', 'HOUR', 'TOTAL_SERVICE_TIME', 'TOTAL_WAIT_TIME']
result = pd.melt(
df, ['HOUR', 'TOTAL_SERVICE_TIME', 'TOTAL_WAIT_TIME'],
['QUEUE_1', 'QUEUE_2', 'QUEUE_3'],
value_name='QUEUE')[cols]
Answer 0 (score: 2)
I think you first need melt or lreshape to reshape your data:
result = pd.lreshape(df, {'QUEUE': ['QUEUE_1','QUEUE_2','QUEUE_3']})
print (result)
HOUR TOTAL_SERVICE_TIME TOTAL_WAIT_TIME QUEUE
0 7 20 30 ABC123
1 7 22 32 ABC123
2 8 15 12 DEF656
3 8 15 16 FED456
4 7 20 30 DEF656
5 8 15 12 ABC123
6 8 15 16 DEF656
7 8 15 12 FED456
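For comparison, here is a rough sketch of the same reshape done with melt instead of lreshape. The sample frame is rebuilt from the question, on the assumption that the blank queue cells are missing values (NaN); the name long_df is only illustrative. Unlike lreshape, melt keeps the rows whose queue is missing and adds a 'variable' column, so both are dropped by hand.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question; blank queue cells are assumed to be NaN.
df = pd.DataFrame({
    'QUEUE_1': ['ABC123', 'ABC123', 'DEF656', 'FED456'],
    'QUEUE_2': ['DEF656', np.nan, 'ABC123', 'DEF656'],
    'QUEUE_3': [np.nan, np.nan, 'FED456', np.nan],
    'HOUR': [7, 7, 8, 8],
    'TOTAL_SERVICE_TIME': [20, 22, 15, 15],
    'TOTAL_WAIT_TIME': [30, 32, 12, 16],
})

# Stack the three queue columns into a single QUEUE column.
long_df = (df.melt(id_vars=['HOUR', 'TOTAL_SERVICE_TIME', 'TOTAL_WAIT_TIME'],
                   value_vars=['QUEUE_1', 'QUEUE_2', 'QUEUE_3'],
                   value_name='QUEUE')
             .drop(columns='variable')      # source column name is not needed
             .dropna(subset=['QUEUE'])      # discard the empty queue slots
             .reset_index(drop=True))
print(long_df)
```

This should give the same eight rows as the lreshape output above, which is why the two functions are interchangeable for this step.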
Then use groupby with mean, and finally reindex by a MultiIndex built from the unique values of the QUEUE and HOUR columns:
mux = pd.MultiIndex.from_product([result.QUEUE.dropna().unique(),
result.dropna().HOUR.unique()], names=['QUEUE','HOUR'])
print (result.groupby(['QUEUE','HOUR'])
.mean()
.reindex(mux, fill_value=0)
.add_prefix('AVG_')
.reset_index())
QUEUE HOUR AVG_TOTAL_SERVICE_TIME AVG_TOTAL_WAIT_TIME
0 ABC123 7 21 31
1 ABC123 8 15 12
2 DEF656 7 20 30
3 DEF656 8 15 14
4 FED456 7 0 0
5 FED456 8 15 14
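To illustrate why the reindex step is needed, here is a small self-contained sketch. It starts from the long-format rows printed above (typed in by hand, so treat the literal frame as an assumption) and shows that groupby alone leaves out the FED456 / hour 7 pair, which the reindex then adds back with zeros. The names means and full are illustrative, and the explicit sorted() calls are my addition to make the row order independent of the input order.

```python
import pandas as pd

# Long-format frame copied from the lreshape output shown earlier.
result = pd.DataFrame({
    'HOUR': [7, 7, 8, 8, 7, 8, 8, 8],
    'TOTAL_SERVICE_TIME': [20, 22, 15, 15, 20, 15, 15, 15],
    'TOTAL_WAIT_TIME': [30, 32, 12, 16, 30, 12, 16, 12],
    'QUEUE': ['ABC123', 'ABC123', 'DEF656', 'FED456',
              'DEF656', 'ABC123', 'DEF656', 'FED456'],
})

means = result.groupby(['QUEUE', 'HOUR']).mean()
# groupby only returns pairs that occur in the data, so (FED456, 7) is absent.
print(('FED456', 7) in means.index)   # False

# Every QUEUE x HOUR combination, sorted on both keys.
mux = pd.MultiIndex.from_product(
    [sorted(result['QUEUE'].unique()), sorted(result['HOUR'].unique())],
    names=['QUEUE', 'HOUR'])

# Missing combinations are filled with 0 instead of NaN.
full = means.reindex(mux, fill_value=0).add_prefix('AVG_').reset_index()
print(full)
```

Note that recent pandas versions print the averages as floats (21.0 rather than 21).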
Answer 1 (score: 1)
Steps:
1) Use pd.lreshape to convert the DF from wide to long format, taking the columns whose names start with QUEUE_X and naming the resulting column QUEUE.
2) Apply pivot_table to the DF; by default it uses np.mean as the aggregation function. Optionally fill missing values with 0.
3) Stack the resulting DF so that the columns are moved into the index, producing a MultiIndex. Add a string prefix and reset the index.
df = pd.lreshape(df, {'QUEUE': df.columns[df.columns.str.startswith('QUEUE')].tolist()})
piv_df = df.pivot_table(index=['QUEUE'], columns=['HOUR'], fill_value=0)
piv_df.stack().add_prefix('AVG_').reset_index()
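The three steps can be tried end to end with the sketch below. The sample frame is rebuilt from the question, assuming the blank queue cells are NaN, and the names queue_cols, long_df and out are only illustrative. aggfunc='mean' is spelled out here even though it is the default.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question; blank queue cells are assumed to be NaN.
df = pd.DataFrame({
    'QUEUE_1': ['ABC123', 'ABC123', 'DEF656', 'FED456'],
    'QUEUE_2': ['DEF656', np.nan, 'ABC123', 'DEF656'],
    'QUEUE_3': [np.nan, np.nan, 'FED456', np.nan],
    'HOUR': [7, 7, 8, 8],
    'TOTAL_SERVICE_TIME': [20, 22, 15, 15],
    'TOTAL_WAIT_TIME': [30, 32, 12, 16],
})

# Step 1: wide to long; rows with a missing queue are dropped by lreshape.
queue_cols = df.columns[df.columns.str.startswith('QUEUE')].tolist()
long_df = pd.lreshape(df, {'QUEUE': queue_cols})

# Step 2: one row per QUEUE, one column per (measure, HOUR) pair;
# combinations that never occur are filled with 0.
piv_df = long_df.pivot_table(index='QUEUE', columns='HOUR',
                             aggfunc='mean', fill_value=0)

# Step 3: stack() moves the innermost column level (HOUR) into the row index.
out = piv_df.stack().add_prefix('AVG_').reset_index()
print(out)
```

Compared with the groupby approach, the pivot creates the missing FED456 / hour 7 cell as a side effect of building the wide table, so no explicit reindex is needed.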