Pandas: write dataframe to json

Time: 2016-07-12 21:58:42

Tags: python json pandas dataframe resampling

I have a dataframe:

         date   id
0  12-12-2015  123
1  13-12-2015  123
2  15-12-2015  123
3  16-12-2015  123
4  18-12-2015  123
5  12-12-2015  456
6  13-12-2015  456
7  15-12-2015  456

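I need to count the dates for each id. I tried df.groupby('id')['date'].count(), but what I need is the following (if a date does not occur for an id, its count should be 0):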

      id   date   count
0  123   12-12-2015   1
1  123   13-12-2015   1
2  123   14-12-2015   0
3  123   15-12-2015   1
4  123   16-12-2015   1
5  123   17-12-2015   0
6  123   18-12-2015   1
7  456   12-12-2015   1
8  456   13-12-2015   1
9  456   14-12-2015   0
10 456   15-12-2015   1

Then I need to write it to a json file in this format:

{
"1234567890abcdef1234567890abcdef": {
    "2016-06": 1, 
    "2016-05": 0, 
    "2016-04": 0, 
    "2016-03": 1, 
    "2016-02": 1, 
    "2016-01": 0
}, 
"0987654321abcdef1234567890abcdef": {
    "2016-06": 1, 
    "2016-05": 1, 
    "2016-04": 1, 
    "2016-03": 0, 
    "2016-02": 0, 
    "2016-01": 0
}

}

How can I do this?

1 answer:

Answer 0: (score: 1)

First use resample:

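import pandas as pd

# The dates are day-month-year strings, so state the format explicitly
df['date'] = pd.to_datetime(df['date'], format='%d-%m-%Y')
df.set_index('date', inplace=True)

# Count rows per id and per day; days with no rows get 0
df = df.groupby('id').resample('D').size().reset_index(name='val')
print(df)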

     id       date  val
0   123 2015-12-12    1
1   123 2015-12-13    1
2   123 2015-12-14    0
3   123 2015-12-15    1
4   123 2015-12-16    1
5   123 2015-12-17    0
6   123 2015-12-18    1
7   456 2015-12-12    1
8   456 2015-12-13    1
9   456 2015-12-14    0
10  456 2015-12-15    1
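
To make the zero-filling step explicit, the same table can be built by hand: count the dates that occur for each id, then reindex those counts over every calendar day between that id's first and last date. This is only a minimal sketch of an alternative to resample, not the method used in the answer; the helper name daily_counts is hypothetical, and the sample data is copied from the question.

```python
import pandas as pd

# Sample data copied from the question (day-month-year strings).
df = pd.DataFrame({
    "date": ["12-12-2015", "13-12-2015", "15-12-2015", "16-12-2015",
             "18-12-2015", "12-12-2015", "13-12-2015", "15-12-2015"],
    "id": [123, 123, 123, 123, 123, 456, 456, 456],
})
df["date"] = pd.to_datetime(df["date"], format="%d-%m-%Y")


def daily_counts(dates):
    """Count rows per day from the first to the last date; absent days get 0."""
    observed = dates.value_counts()
    every_day = pd.date_range(dates.min(), dates.max(), freq="D")
    return observed.reindex(every_day, fill_value=0)


# Build one small frame per id, then stack them into a single table.
frames = []
for id_, dates in df.groupby("id")["date"]:
    counts = daily_counts(dates)
    frames.append(
        pd.DataFrame({"id": id_, "date": counts.index, "val": counts.to_numpy()})
    )

result = pd.concat(frames, ignore_index=True)
print(result)
```

Each id only spans its own first-to-last date range here, which matches the output above (id 456 stops at 2015-12-15).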

Then use to_json:

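# Keep only the date part (drop the 00:00:00 time component)
df['date'] = df.date.dt.date
# Build one {date: count} dict per id, then serialize the whole result
print(df.groupby('id').apply(lambda x: x.set_index('date')['val'].to_dict()).to_json())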

{"123":{"2015-12-18":1,"2015-12-15":1,"2015-12-12":1,"2015-12-16":1,"2015-12-13":1,"2015-12-17":0,"2015-12-14":0},
"456":{"2015-12-15":1,"2015-12-12":1,"2015-12-13":1,"2015-12-14":0}}
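
The question asks for a file rather than a printed string. One possible way to get there, sketched below, is to build the nested dictionary in plain Python and hand it to the standard json module. The frame named counts is retyped from the resampled output above, and the file name counts.json and its temp-directory location are assumptions made only for illustration.

```python
import json
import os
import tempfile

import pandas as pd

# Resampled counts, retyped from the output shown in the answer.
dates = (pd.date_range("2015-12-12", "2015-12-18").tolist()
         + pd.date_range("2015-12-12", "2015-12-15").tolist())
counts = pd.DataFrame({
    "id": [123] * 7 + [456] * 4,
    "date": dates,
    "val": [1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1],
})

# One {"YYYY-MM-DD": count} dict per id; JSON object keys must be strings.
nested = {
    str(id_): dict(zip(group["date"].dt.strftime("%Y-%m-%d"),
                       group["val"].tolist()))
    for id_, group in counts.groupby("id")
}

# Hypothetical output location, chosen only for this example.
out_path = os.path.join(tempfile.gettempdir(), "counts.json")
with open(out_path, "w") as fh:
    json.dump(nested, fh, indent=4, sort_keys=True)

# Read the file back to confirm it round-trips.
with open(out_path) as fh:
    loaded = json.load(fh)
print(loaded["456"])
```

For month keys like "2016-06" in the question's target format, the same pattern should apply after resampling by month instead of by day and formatting the dates with "%Y-%m".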