Question

I'm having a hard time understanding why my code doesn't work as expected.

I have a DataFrame: Screenshot to dataframe (sorry, I don't have enough reputation to post images)

I aggregate it as follows to get the sum of testBytes:

aggregation = {'testBytes' : ['sum']}
tests_DL_groupped = tests_DL_short.groupby(['measDay','_p_live','_p_compositeId','Latitude','Longitude','testType']).agg(aggregation).reset_index()

Now the actual question is why this code doesn't work as expected and produces NaN:

tests_DL_groupped.loc[:,'testMBytes'] = tests_DL_groupped['testBytes']/1000/1000

(screenshot: not working)

While this works fine:

tests_DL_groupped['testMBytes'] = tests_DL_groupped['testBytes']/1000/1000

(screenshot: working)

The .loc version is supposed to be the preferred pandas way...

Thank you very much!

Answer 1

The problem is that the columns are a MultiIndex.

The solution is to change:

aggregation = {'testBytes' : ['sum']}

to:

aggregation = {'testBytes' : 'sum'}

to avoid it.
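To see the difference for yourself, here is a minimal sketch on a made-up frame (the data and the reduced set of group keys are assumptions for illustration, not the asker's real data). Passing a list of functions gives two-level columns, so selecting testBytes returns a DataFrame with a single 'sum' column rather than a Series, which is most likely why the .loc assignment cannot align it with the new column label. Passing a plain string keeps the columns flat.

```python
import pandas as pd

# Hypothetical stand-in for tests_DL_short, with a single group key for brevity.
tests_DL_short = pd.DataFrame({
    "measDay": ["d1", "d1", "d2"],
    "testBytes": [1_000_000, 2_000_000, 5_000_000],
})

# List of functions -> columns become a two-level MultiIndex.
nested = tests_DL_short.groupby("measDay").agg({"testBytes": ["sum"]}).reset_index()
print(nested.columns.nlevels)           # 2
print(type(nested["testBytes"]))        # a DataFrame, not a Series

# Plain string -> columns stay flat.
flat = tests_DL_short.groupby("measDay").agg({"testBytes": "sum"}).reset_index()
print(flat.columns.nlevels)             # 1

# With flat columns, the .loc assignment from the question behaves as expected.
flat.loc[:, "testMBytes"] = flat["testBytes"] / 1000 / 1000
print(flat)
```

If you need to keep the list form for some reason, flattening the columns after the aggregation should also work, but changing the dict value to a string is the simpler fix.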

Or use GroupBy.sum:

cols = ['measDay','_p_live','_p_compositeId','Latitude','Longitude','testType']
tests_DL_groupped = tests_DL_short.groupby(cols)['testBytes'].sum().reset_index()

tests_DL_groupped = tests_DL_short.groupby(cols, as_index=False)['testBytes'].sum()
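As a quick check, the two GroupBy.sum variants above can be compared on a small made-up frame (the sample data and the shortened list of group keys are assumptions for illustration only). Both return a frame with ordinary flat columns, so adding testMBytes through .loc gives real numbers instead of NaN.

```python
import pandas as pd

# Hypothetical stand-in for tests_DL_short with only two of the group keys.
tests_DL_short = pd.DataFrame({
    "measDay": ["d1", "d1", "d2"],
    "testType": ["DL", "DL", "DL"],
    "testBytes": [1_000_000, 2_000_000, 5_000_000],
})
cols = ["measDay", "testType"]

# Variant 1: aggregate to a Series, then move the group keys back into columns.
a = tests_DL_short.groupby(cols)["testBytes"].sum().reset_index()

# Variant 2: ask groupby to keep the keys as columns directly.
b = tests_DL_short.groupby(cols, as_index=False)["testBytes"].sum()

# Check whether both variants give the same result.
print(a.equals(b))

# Flat columns, so the .loc assignment from the question works here.
a.loc[:, "testMBytes"] = a["testBytes"] / 1000 / 1000
print(a)
```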

Adding a new column via .loc from an aggregation returns NaN
