pandas dataframe groupby: sum/count of only positive numbers

Asked: 2013-12-06 19:09:08

Tags: python pandas

I have a dataframe ('frame') that I want to aggregate by country and date:

aggregated=pd.DataFrame(frame.groupby(['Country','Date']).CaseID.count())

aggregated["Total duration"]=frame.groupby(['Country','Date']).Hours.sum()

aggregated["Mean duration"]=frame.groupby(['Country','Date']).Hours.mean()

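I would like to compute the figures above (total duration, mean duration, etc.) using only the positive 'Hours' values in 'frame'. How can I do this?

Thanks!

Sample 'frame':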

import pandas as pd
Line1 = {"Country": "USA", "Date":"01 jan", "Hours":4}
Line2 = {"Country": "USA", "Date":"01 jan", "Hours":3}
Line3 = {"Country": "USA", "Date":"01 jan", "Hours":-999}
Line4 = {"Country": "Japan", "Date":"01 jan", "Hours":3}
frame = pd.DataFrame([Line1, Line2, Line3, Line4])

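2 Answers:

Answer 0 (score: 7)

Not as elegant as the other answer, but it behaves differently in some corner cases: a group with no positive values is kept and shows NaN. Here df stands for the frame from the original question.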

>>> df.groupby(['Country','Date']).agg(lambda x: x[x>0].mean())
                Hours
Country Date
Japan   01 jan    3.0
USA     01 jan    3.5
>>> df.loc[3, 'Hours'] = -1  # .ix was removed in pandas 1.0; .loc is the label-based equivalent
>>> df.groupby(['Country','Date']).agg(lambda x: x[x>0].mean())
                Hours
Country Date
Japan   01 jan    NaN
USA     01 jan    3.5
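
The corner case can be seen by running both approaches side by side. The following is only a sketch built on the sample data from the question; the variable names `keys`, `filtered` and `masked` are illustrative, and the `where`-based masking is an alternative to the lambda above, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame([
    {"Country": "USA", "Date": "01 jan", "Hours": 4},
    {"Country": "USA", "Date": "01 jan", "Hours": 3},
    {"Country": "USA", "Date": "01 jan", "Hours": -999},
    {"Country": "Japan", "Date": "01 jan", "Hours": 3},
])
df.loc[3, "Hours"] = -1  # Japan now has no positive values

keys = ["Country", "Date"]

# Filter first: Japan has no rows left, so its group drops out of the result.
filtered = df[df["Hours"] > 0].groupby(keys)["Hours"].mean()

# Mask instead: non-positive values become NaN, which mean() skips,
# so every group is kept and Japan shows NaN.
masked = (
    df.assign(Hours=df["Hours"].where(df["Hours"] > 0))
      .groupby(keys)["Hours"]
      .mean()
)

print(filtered)
print(masked)
```

Which behaviour is preferable depends on whether groups without any positive value should still appear in the report.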

Answer 1 (score: 6)

How about:

frame[frame["Hours"] > 0].groupby(['Country','Date'])
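
The filtered groupby can then be aggregated as in the question. A minimal sketch, assuming the sample frame from the question; since that sample has no `CaseID` column, the row count is taken from `Hours` here, and the names `positive` and `Cases` are illustrative.

```python
import pandas as pd

frame = pd.DataFrame([
    {"Country": "USA", "Date": "01 jan", "Hours": 4},
    {"Country": "USA", "Date": "01 jan", "Hours": 3},
    {"Country": "USA", "Date": "01 jan", "Hours": -999},
    {"Country": "Japan", "Date": "01 jan", "Hours": 3},
])

# Keep only rows with positive Hours, then aggregate per (Country, Date).
positive = frame[frame["Hours"] > 0]
aggregated = positive.groupby(["Country", "Date"])["Hours"].agg(["count", "sum", "mean"])

# Rename to the column labels used in the question (Cases is a stand-in for the CaseID count).
aggregated.columns = ["Cases", "Total duration", "Mean duration"]
print(aggregated)
```

With a real `CaseID` column, the count could instead come from `positive.groupby(['Country','Date']).CaseID.count()`, as in the question.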