Format dataframe output by group into JSON records

Time: 2018-01-01 09:38:19

Tags: python json pandas dataframe group-by

My dataframe df looks like this:

        count_arena_users  count_users      event   timestamp
    0                4458        12499   football  2017-04-30
    1                2706         4605    cricket  2015-06-30
    2                 592         4176     tennis  2016-06-30
    3                3427        10126  badminton  2017-05-31
    4                 717         2313   football  2016-03-31
    5                 101          155     hockey  2016-01-31
    6               45923       191180     tennis  2015-12-31
    7                1208         2824  badminton  2017-01-31
    8                5577         8906    cricket  2016-02-29
    9                 111          205   football  2016-03-31
    10                  4            8     hockey  2017-09-30

I fetched the data from a psql database, and now I want to produce the output of "select * from tbl_arena" in JSON format. The required JSON format has to look like this:

    [
      {
        "event": "football",
        "data_to_plot": [
          { "count_arena_users": 717, "count_users": 2313, "timestamp": "2016-03-31" },
          { "count_arena_users": 111, "count_users": 205, "timestamp": "2016-03-31" },
          { "count_arena_users": 4458, "count_users": 12499, "timestamp": "2017-04-30" }
        ]
      },
      {
        "event": "cricket",
        "data_to_plot": [
          { "count_arena_users": 2706, "count_users": 4605, "timestamp": "2015-06-30" },
          { "count_arena_users": 5577, "count_users": 8906, "timestamp": "2016-02-29" }
        ]
      }
      . . . .
    ]

The values of all columns are grouped by the event column, and the order of the sub-dictionaries within each group is then determined by the timestamp column, i.e. earlier dates appear first and newer dates appear below them.

I am using Python 3.x and json.dumps to format the data as JSON.

1 answer:

Answer 0: (score: 1)

The high-level process is as follows:

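  1. Aggregate all the data belonging to each event. This needs a groupby + apply.
  2. Convert the result to a series of records, one record per event together with its associated data. Use to_json with orient='records'.

    df.groupby('event', sort=False)\
      .apply(lambda x: x.drop(columns='event', errors='ignore')
                        .sort_values('timestamp')
                        .to_dict('records'))\
      .reset_index(name='data_to_plot')\
      .to_json(orient='records')

    [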
      {
        "event": "football",
        "data_to_plot": [
          {
            "count_arena_users": 717,
            "timestamp": "2016-03-31",
            "count_users": 2313
          },
          {
            "count_arena_users": 111,
            "timestamp": "2016-03-31",
            "count_users": 205
          },
          {
            "count_arena_users": 4458,
            "timestamp": "2017-04-30",
            "count_users": 12499
          }
        ]
      },
      ...
    ]
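
Since the question mentions json.dumps, the same two steps can also be sketched without apply, by looping over the groups and building plain Python dictionaries. This is only a minimal sketch: the helper name group_records and its parameters are made up for illustration, the sample rows are copied from the question, and it assumes the timestamp column holds ISO-formatted strings (so sorting them as text gives chronological order).

```python
import json

import pandas as pd

# Sample rows copied from the question; timestamps are kept as plain strings.
df = pd.DataFrame({
    "count_arena_users": [4458, 2706, 592, 3427, 717, 101, 45923, 1208, 5577, 111, 4],
    "count_users": [12499, 4605, 4176, 10126, 2313, 155, 191180, 2824, 8906, 205, 8],
    "event": ["football", "cricket", "tennis", "badminton", "football", "hockey",
              "tennis", "badminton", "cricket", "football", "hockey"],
    "timestamp": ["2017-04-30", "2015-06-30", "2016-06-30", "2017-05-31", "2016-03-31",
                  "2016-01-31", "2015-12-31", "2017-01-31", "2016-02-29", "2016-03-31",
                  "2017-09-30"],
})


def group_records(frame, key="event", order_by="timestamp", name="data_to_plot"):
    """Hypothetical helper: one dict per group, rows sorted by `order_by`."""
    out = []
    # sort=False keeps the groups in order of first appearance in the frame.
    for value, group in frame.groupby(key, sort=False):
        rows = (
            group.drop(columns=key)
            # mergesort is stable, so rows with equal timestamps keep their order.
            .sort_values(order_by, kind="mergesort")
            .to_dict("records")
        )
        out.append({key: value, name: rows})
    return out


records = group_records(df)
# default=int is a fallback in case any NumPy integers remain in the records.
print(json.dumps(records, indent=2, default=int))
```

Compared with the groupby + apply chain, this version returns a list of dictionaries rather than a JSON string, so it can be passed to json.dumps directly or nested inside a larger response.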
    
