Converting daily data to weekly means and medians

Time: 2016-10-13 07:37:10

Tags: python date pandas mean median

I have a list of dictionaries like this:

[
    {'2016-06-11': 10, 
     '2016-06-09': 10, 
     'ID': 1, 
     '2016-06-04': 10,
     '2016-06-07': 10,
     '2016-06-06': 10,
     '2016-06-01': 10,
     '2016-06-03': 10,
     'type': 'primary',
     '2016-06-05': 10,
     '2016-06-10': 10,
     '2016-06-02': 10,
     '2016-06-08': 10}, 
    {'2016-06-11': 2,
     '2016-06-09': 1,
     'ID': 2,
     'type': 'secondary',
     '2016-06-04': 1,
     '2016-06-07': 1,
     '2016-06-06': 1,
     '2016-06-01': 1,
     '2016-06-03': 1,
     '2016-06-05': 1,
     '2016-06-10': 2,
     '2016-06-02': 1,
     '2016-06-08': 1}
]

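I need to convert it to a similar list of dicts, where the keys are weeks (starting on Monday, e.g. 2016-06-03 - 2016-06-09) or months (e.g. 2016-06), and the values are the mean or median of the values in that week/month. What is the simplest way to do this?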

1 answer:

Answer 0 (score: 1)

I think you can resample by month and aggregate with mean or median, then create the list of dicts with DataFrame.to_dict:

import pandas as pd

# d is the list of dicts from the question
df = pd.DataFrame(d)
print (df)
   2016-06-01  2016-06-02  2016-06-03  2016-06-04  2016-06-05  2016-06-06  \
0          10          10          10          10          10          10   
1           1           1           1           1           1           1   

   2016-06-07  2016-06-08  2016-06-09  2016-06-10  2016-06-11  ID       type  
0          10          10          10          10          10   1    primary  
1           1           1           1           2           2   2  secondary

df.set_index(['type', 'ID'], inplace=True)
df.columns = pd.to_datetime(df.columns)
# 'M' is the month-end alias; pandas 2.2 and later spell it 'ME'
df = df.T.resample('M').mean()
df.index = df.index.strftime('%Y-%m')
print (df)
type    primary secondary
ID            1         2
2016-06    10.0  1.181818

print (df.T.reset_index().to_dict(orient='records'))
[{'type': 'primary', '2016-06': 10.0, 'ID': 1}, 
 {'type': 'secondary', '2016-06': 1.1818181818181819, 'ID': 2}]

# Median: start again from the original data
df = pd.DataFrame(d)
df.set_index(['type', 'ID'], inplace=True)
df.columns = pd.to_datetime(df.columns)
df = df.T.resample('M').median()
df.index = df.index.strftime('%Y-%m')
print (df)
type    primary secondary
ID            1         2
2016-06      10         1

print (df.T.reset_index().to_dict(orient='records'))
[{'type': 'primary', '2016-06': 10, 'ID': 1}, 
 {'type': 'secondary', '2016-06': 1, 'ID': 2}]

Another solution, instead of resample, is to groupby the month periods created by DatetimeIndex.to_period:

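# df is the frame after set_index and to_datetime above, before any resampling
daily = df.T
mean_df = daily.groupby(daily.index.to_period('M')).mean()
median_df = daily.groupby(daily.index.to_period('M')).median()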
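
The question also asks for weekly values with weeks starting on Monday, which the code above only covers for months. Below is a minimal sketch of the same groupby / to_period idea with weekly periods. It assumes that pandas' plain 'W' period alias (weeks ending on Sunday, so running Monday to Sunday) matches the intended week definition. The names days, daily, weeks and records are only illustrative, and the sample data is rebuilt programmatically to mirror the values in the question.

```python
import pandas as pd

# Rebuild the question's sample data: 11 daily values per record.
days = pd.date_range('2016-06-01', '2016-06-11').strftime('%Y-%m-%d')
d = [
    dict({'ID': 1, 'type': 'primary'}, **{day: 10 for day in days}),
    dict({'ID': 2, 'type': 'secondary'},
         **{day: (2 if day >= '2016-06-10' else 1) for day in days}),
]

df = pd.DataFrame(d).set_index(['type', 'ID'])
df.columns = pd.to_datetime(df.columns)
daily = df.T  # rows: dates, columns: (type, ID)

# Assumption: 'W' means W-SUN, i.e. each period runs Monday..Sunday.
weeks = daily.index.to_period('W')
weekly_mean = daily.groupby(weeks).mean()
weekly_median = daily.groupby(weeks).median()

# Label each week by its Monday, then convert back to a list of dicts.
weekly_mean.index = weekly_mean.index.start_time.strftime('%Y-%m-%d')
records = weekly_mean.T.reset_index().to_dict(orient='records')
print(records)
```

With this data the week keys come out as '2016-05-30' and '2016-06-06', since 2016-06-01 falls on a Wednesday. Replacing weekly_mean with weekly_median in the last two steps gives the median version.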