Nesting CSV into JSON with pandas - output slightly off

Time: 2017-11-19 06:31:51

Tags: python json csv

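I'm having trouble adapting this code to my needs. I feel like I'm close, but I can't quite get there. The goal is to create nested JSON from a CSV file. Below are the desired output, the CSV data, and my current code. Any help is appreciated.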

Current code:

import json
import pandas as pd

df = pd.read_csv('txn_data.csv')

def get_nested_rec(key, grp):
    rec = {}
    rec['date'] = key[0]
    rec['name'] = key[1]
    rec['value'] = key[2]

    for field in ['name','value']:
        rec[field] = list(grp[field].unique())

    return rec

records = []
for key, grp in df.groupby(['date']):
    rec = get_nested_rec(key, grp)
    records.append(rec)

records = dict(data = records)

print(json.dumps(records, indent=4))

CSV data:

date,name,value
1/1/13,Quick Serve,304127
1/1/13,Restaurant,1843286
1/1/13,Retail,239675
1/2/13,Quick Serve,422847
1/2/13,Restaurant,1582848
1/2/13,Retail,394358

Desired JSON output:

desired_output = [  
   {  
      "date":"2017-01-01",
      "details":[  
         {  
            "name":"Retail",
            "value":9192
         },
         {  
            "name":"Restaurant",
            "value":6753
         },
         {  
            "name":"Quickserve",
            "value":1219
         }
      ]
   },
   {  
      "date":"2017-02-01",
      "details":[  
         {  
            "name":"Retail",
            "value":9192
         },
         {  
            "name":"Restaurant",
            "value":6753
         },
         {  
            "name":"Quickserve",
            "value":1219
         }
      ]
   }
]

What I currently get:

{
    "data": [
        {
            "date": "1", 
            "name": [
                "Automotive", 
                "Durable Goods", 
                "Entertainment", 
                "Food", 
                "Lodging", 
                "Petroleum", 
                "Quick Serve", 
                "Restaurant", 
                "Retail", 
                "Service", 
                "Transportation & Utilities", 
                "Unknown"
            ], 
            "value": [
                91406, 
                9889, 
                172676, 
                358922, 
                63502, 
                1982048, 
                304127, 
                1843286, 
                239675, 
                106462, 
                25924, 
                909
            ]
        }, 
        {
            "date": "1", 
            "name": [
                "Automotive", 
                "Durable Goods", 
                "Entertainment", 
                "Food", 
                "Lodging", 
                "Petroleum", 
                "Quick Serve", 
                "Restaurant", 
                "Retail", 
                "Service", 
                "Transportation & Utilities", 
                "Unknown"
            ], 
            "value": [
                146041, 
                33090, 
                103159, 
                336956, 
                66726, 
                2191346, 
                422847, 
                1582848, 
                394358, 
                339989, 
                49477, 
                494
            ]
        }
    ]
}

2 answers:

Answer 0 (score: 2)

I would try to solve this task with a simpler approach, as follows:

import json
import pandas as pd

df = pd.read_csv('test.csv')
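l_data = []

for key, grp in df.groupby('date'):
    # Build a fresh dict for every group; reusing a single dict would make
    # every list entry point at the same (last) group.
    data = {}
    data['date'] = key
    # to_json returns a string, so parse it back into a list of dicts
    # to get real nesting instead of an embedded JSON string.
    data['details'] = json.loads(grp[['name', 'value']].to_json(orient='records'))
    l_data.append(data)

In [32]:
print(json.dumps(l_data))

Out[32]:
[  
   {  
      "date":"1/1/13",
      "details":[  
         {  
            "name":"Quick Serve",
            "value":304127
         },
         {  
            "name":"Restaurant",
            "value":1843286
         },
         {  
            "name":"Retail",
            "value":239675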
         }
      ]
   },
   {  
      "date":"1/2/13",
      "details":[  
         {  
            "name":"Quick Serve",
            "value":422847
         },
         {  
            "name":"Restaurant",
            "value":1582848
         },
         {  
            "name":"Retail",
            "value":394358
         }
      ]
   }
]
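
The grouping idea in this answer can be checked without a CSV file on disk. What follows is a minimal sketch, assuming the six sample rows from the question; `SAMPLE_CSV` and `nest_by_date` are names made up here for illustration and do not come from the original post.

```python
import io
import json

import pandas as pd

# Sample rows copied from the question; SAMPLE_CSV is an illustrative name.
SAMPLE_CSV = """date,name,value
1/1/13,Quick Serve,304127
1/1/13,Restaurant,1843286
1/1/13,Retail,239675
1/2/13,Quick Serve,422847
1/2/13,Restaurant,1582848
1/2/13,Retail,394358
"""


def nest_by_date(df):
    """Return one {'date', 'details'} dict per distinct date (hypothetical helper)."""
    nested = []
    # Grouping by a single column name yields plain scalar keys;
    # sort=False keeps the dates in order of first appearance.
    for date, grp in df.groupby('date', sort=False):
        # Round-trip through to_json so the values end up as plain Python types.
        details = json.loads(grp[['name', 'value']].to_json(orient='records'))
        nested.append({'date': date, 'details': details})
    return nested


df = pd.read_csv(io.StringIO(SAMPLE_CSV))
result = nest_by_date(df)
print(json.dumps(result, indent=3))
```

Creating the dict inside the loop matters here: each list entry then owns its own `date` and `details` rather than sharing one mutable object.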

Answer 1 (score: 1)

I have adapted your code to produce output in the format you requested.

import json
import pandas as pd

df = pd.read_csv('txn_data.csv')

def get_nested_rec(key, grp):
    rec = {}
    rec['date'] = key
    rec['details'] = []

    for index, row in grp.iterrows():
        rec['details'].append({
            'name': row['name'],
            'value': row['value']
        })

    return rec

records = []
# Group by the column name rather than a one-element list, so each key is a plain date string.
for key, grp in df.groupby('date'):
    rec = get_nested_rec(key, grp)
    records.append(rec)

records = dict(data = records)

print(json.dumps(records, indent=4))

Here is the resulting output:

{
    "data": [
        {
            "date": "1/1/13", 
            "details": [
                {
                    "name": "Quick Serve", 
                    "value": 304127
                }, 
                {
                    "name": "Restaurant", 
                    "value": 1843286
                }, 
                {
                    "name": "Retail", 
                    "value": 239675
                }
            ]
        }, 
        {
            "date": "1/2/13", 
            "details": [
                {
                    "name": "Quick Serve", 
                    "value": 422847
                }, 
                {
                    "name": "Restaurant", 
                    "value": 1582848
                }, 
                {
                    "name": "Retail", 
                    "value": 394358
                }
            ]
        }
    ]
}
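
The desired output in the question is a bare list, without the outer "data" key. A possible variant is sketched below, assuming the sample rows from the question are read from an in-memory string; `CSV_TEXT` is a hypothetical name, and `zip` over the two columns is swapped in for `iterrows` purely as an illustration.

```python
import io
import json

import pandas as pd

# Sample rows copied from the question; CSV_TEXT is an illustrative name.
CSV_TEXT = """date,name,value
1/1/13,Quick Serve,304127
1/1/13,Restaurant,1843286
1/1/13,Retail,239675
1/2/13,Quick Serve,422847
1/2/13,Restaurant,1582848
1/2/13,Retail,394358
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# One dict per date; 'details' holds one {'name', 'value'} dict per row of the group.
records = [
    {
        'date': date,
        'details': [
            # int() guards against NumPy integer types that json cannot serialize.
            {'name': name, 'value': int(value)}
            for name, value in zip(grp['name'], grp['value'])
        ],
    }
    for date, grp in df.groupby('date')
]

# Dump the list directly instead of wrapping it in dict(data=records).
print(json.dumps(records, indent=4))
```

If the consumer expects the wrapped form shown in the answer above, `json.dumps({'data': records}, indent=4)` restores it.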