Convert a DataFrame to JSON

Time: 2019-03-28 16:18:17

Tags: python pandas

I have the following DataFrame:

                                    price
item_name            timestamp
item1                2018-10-12     12.2
                     2018-10-13     14.3
                     2018-10-14     17.1
item2                2018-10-12     11.4
                     2018-10-13     15.6
                     2018-10-14     17.2
item3                2018-10-12     11.5
                     2018-10-13     17.2
                     2018-10-14     17.2

I want to convert it to JSON in the following format:

{
   "item1":{
      "1539302400000":12.2,
      "1539388800000":14.3,
      "1539475200000":17.1
   },
   "item2":{
      "1539302400000":11.4,
      "1539388800000":15.6,
      "1539475200000":17.2
   },
   "item3":{
      "1539302400000":11.5,
      "1539388800000":17.2,
      "1539475200000":17.2
   }
}

Or:

{
   "1539302400000":{
      "item1":12.2,
      "item2":11.4,
      "item3":11.5
   },
   "1539388800000":{
      "item1":14.3,
      "item2":15.6,
      "item3":17.2
   },
   "1539475200000":{
      "item1":17.1,
      "item2":17.2,
      "item3":17.2
   }
}

However, I can't get JSON in the desired format.

df.reset_index().to_json(orient='records') gives me this:

[
   {
      "item_name":"item1",
      "timestamp":1539302400000,
      "price":12.2
   },
   {
      "item_name":"item1",
      "timestamp":1539388800000,
      "price":14.3
   },
   {
      "item_name":"item1",
      "timestamp":1539475200000,
      "price":17.1
   },
   {
      "item_name":"item2",
      "timestamp":1539302400000,
      "price":11.4
   },
   {
      "item_name":"item2",
      "timestamp":1539388800000,
      "price":15.6
   },
   {
      "item_name":"item2",
      "timestamp":1539475200000,
      "price":17.2
   }
]

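I also tried different values for the orient parameter, but none of them worked. I'm not sure whether this is possible, but if it is, could anyone give me a hint on how to achieve it?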

1 Answer:

Answer 0: (score: 1)

Based on your DataFrame (columns and index):

import pandas as pd
import json

df = pd.DataFrame( data = [
    ('item1', '2018-10-12', 12.2),
    ('item1', '2018-10-13', 14.3),
    ('item1', '2018-10-14', 17.1),
    ('item2', '2018-10-12', 11.4),
    ('item2', '2018-10-13', 15.6),
    ('item2', '2018-10-14', 17.2),
    ('item3', '2018-10-12', 11.5),
    ('item3', '2018-10-13', 17.2),
    ('item3', '2018-10-14', 17.2)  
],columns=['item_name','timestamp','price'])

df = df.set_index(['item_name','timestamp'])

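You can build the JSON object yourself, however complex it needs to be. The catch is that iterating row by row becomes very inefficient and slow if the DataFrame is large.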

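data = {}
# Each index value is an (item_name, timestamp) tuple from the MultiIndex.
for (item_name, timestamp), row in df.iterrows():
    # Create the inner dict the first time an item is seen.
    if item_name not in data:
        data[item_name] = {}
    data[item_name][timestamp] = row['price']

print(json.dumps(data))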
  

Output

{  
   "item2":{  
      "2018-10-13":15.6,
      "2018-10-12":11.4,
      "2018-10-14":17.2
   },
   "item3":{  
      "2018-10-13":17.2,
      "2018-10-12":11.5,
      "2018-10-14":17.2
   },
   "item1":{  
      "2018-10-13":14.3,
      "2018-10-12":12.2,
      "2018-10-14":17.1
   }
}
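
As an alternative to the explicit loop, pandas can do the reshaping itself. The following is a minimal sketch, assuming the same df as above (rebuilt here so the snippet runs on its own); the names wide, by_item and by_date are only illustrative. It moves the timestamp level of the index into columns and lets to_json produce the nested structure:

```python
import json
import pandas as pd

df = pd.DataFrame(
    [
        ("item1", "2018-10-12", 12.2),
        ("item1", "2018-10-13", 14.3),
        ("item1", "2018-10-14", 17.1),
        ("item2", "2018-10-12", 11.4),
        ("item2", "2018-10-13", 15.6),
        ("item2", "2018-10-14", 17.2),
        ("item3", "2018-10-12", 11.5),
        ("item3", "2018-10-13", 17.2),
        ("item3", "2018-10-14", 17.2),
    ],
    columns=["item_name", "timestamp", "price"],
).set_index(["item_name", "timestamp"])

# Move the inner index level into columns: one row per item, one column per date.
wide = df["price"].unstack("timestamp")

# orient="index" emits {row_label: {column_label: value}}, i.e. {item: {date: price}}.
by_item_json = wide.to_json(orient="index")

# orient="columns" emits the transposed layout, i.e. {date: {item: price}}.
by_date_json = wide.to_json(orient="columns")

# Parsed back into dicts only so the result is easy to inspect.
by_item = json.loads(by_item_json)
by_date = json.loads(by_date_json)
print(by_item_json)
```

Because the reshaping is done by pandas rather than in a Python-level loop, this should scale better to large frames. Note that unstack raises an error if the same (item_name, timestamp) pair appears more than once.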

Obviously, you can change the date format as needed while building the dictionary in the loop.
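
To get the millisecond-epoch keys shown in the question, the date strings can be converted while the dictionary is built. This is only a sketch, assuming the dates are meant as midnight UTC; to_epoch_ms is a hypothetical helper name, not a pandas function, and the frame is rebuilt so the snippet runs on its own.

```python
import json
import pandas as pd

df = pd.DataFrame(
    [
        ("item1", "2018-10-12", 12.2),
        ("item1", "2018-10-13", 14.3),
        ("item1", "2018-10-14", 17.1),
        ("item2", "2018-10-12", 11.4),
        ("item2", "2018-10-13", 15.6),
        ("item2", "2018-10-14", 17.2),
        ("item3", "2018-10-12", 11.5),
        ("item3", "2018-10-13", 17.2),
        ("item3", "2018-10-14", 17.2),
    ],
    columns=["item_name", "timestamp", "price"],
).set_index(["item_name", "timestamp"])


def to_epoch_ms(date_string):
    """Hypothetical helper: date string -> milliseconds since the Unix epoch.

    Assumption: the date is interpreted as midnight UTC.
    """
    ts = pd.Timestamp(date_string, tz="UTC")
    return int(ts.timestamp() * 1000)


data = {}
# Iterating the price Series yields ((item_name, timestamp), price) pairs.
for (item_name, timestamp), price in df["price"].items():
    # JSON object keys are strings, so the epoch value is stored as a string.
    data.setdefault(item_name, {})[str(to_epoch_ms(timestamp))] = float(price)

print(json.dumps(data))
```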