CSV to complex nested JSON

Time: 2020-11-09 07:49:29

Tags: python json pandas csv

I have a huge CSV file that looks like this:

PN,PCA Code,MPN Code,DATE_CODE,Supplier Code,CM Code,Fiscal YEAR,Fiscal MONTH,Usage,Defects
13-1668-01,73-2590,MPN148,1639,S125,CM1,2017,5,65388,0
20-0127-02,73-2171,MPN170,1707,S125,CM1,2017,9,11895,0
19-2472-01,73-2302,MPN24,1711,S119,CM1,2017,10,4479,0
20-0127-02,73-2169,MPN170,1706,S125,CM1,2017,9,7322,0
20-0127-02,73-2296,MPN170,1822,S125,CM1,2018,12,180193,0
15-14399-01,73-2590,MPN195,1739,S133,CM6,2018,11,1290,0

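What I want to do is group all the data by PCA Code. Each PCA Code has a certain set of part numbers (PN), each part is made under certain MPN Codes, and the final nested JSON structure I want looks like this: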

[
    {
        "PCA": {
            "code": "73-2590",
            "CM": ["CM1", "CM6"],
            "parts": [
                {
                    "number": "13-1668-01",
                    "manufacturer": [
                        {
                            "id": "MPN148",
                            "info": [
                                {
                                    "date_code": 1639,
                                    "supplier": {
                                        "id": "S125",
                                        "FYFM": "2017-5",
                                        "usage": 65388,
                                        "defects": 0
                                    }
                                }
                            ]
                        }
                    ]
                }
            ]
        }
    }
]

So I want this structure for multiple part numbers (PN), each with different MPNs, different date codes, and so on.

I am currently using Pandas to do this, but I am stuck on how to proceed with the nesting.

My code so far:

import json
import pandas as pd

dataframe = pd.read_csv('files/dppm_wc.csv')

data = {'PCAs': []}

for key, group in dataframe.groupby('PCA Code'):
    for index, row in group.iterrows():
        temp_dict = {'PCA Code': key, 'CM Code': row['CM Code'], 'parts': []}

with open('output.txt', 'w') as file:
    file.write(json.dumps(data, indent=4))

How do I proceed to get the desired nested JSON format? Is there a better way than what I am doing?

1 Answer:

Answer 0 (score: 0)

I don't fully understand how you want to use the structure, but I think it can be achieved like this:

data = {'PCAs': []}

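# `dataframe` is the DataFrame loaded with pd.read_csv in the question.
for key, group in dataframe.groupby('PCA Code'):
    temp_dict = {'PCA Code': key, 'CM Code': [], 'parts': []}
    for index, row in group.iterrows():
        # Record each CM code only once per PCA code.
        if row['CM Code'] not in temp_dict['CM Code']:
            temp_dict['CM Code'].append(row['CM Code'])
        temp_dict['parts'].append(
            {'number': row['PN'],
             'manufacturer': [
                 {
                     'id': row['MPN Code'],
                     'info': [
                         {
                             # int() keeps the values plain Python integers for json.dumps.
                             'date_code': int(row['DATE_CODE']),
                             'supplier': {'id': row['Supplier Code'],
                                          'FYFM': '%s-%s' % (row['Fiscal YEAR'], row['Fiscal MONTH']),
                                          'usage': int(row['Usage']),
                                          'defects': int(row['Defects'])}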
                         }
                     ]
                 }]
             }
        )
    data['PCAs'].append(temp_dict)
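
The loop above creates one 'parts' entry per CSV row. To get the deeper nesting the question asks for (one entry per PCA code, then per part number, then per MPN code), one option is to nest the groupby calls. The following is a minimal sketch, not a definitive implementation: build_info and build_nested are hypothetical helper names, the sample rows are read from an in-memory string instead of files/dppm_wc.csv, and sort=False is an assumption so that groups keep the order in which they appear in the CSV. Numeric fields are wrapped in int() so that json.dumps always receives plain Python integers.

```python
import io
import json

import pandas as pd

# Sample rows copied from the question; a real run would read the CSV file instead.
CSV_TEXT = """PN,PCA Code,MPN Code,DATE_CODE,Supplier Code,CM Code,Fiscal YEAR,Fiscal MONTH,Usage,Defects
13-1668-01,73-2590,MPN148,1639,S125,CM1,2017,5,65388,0
20-0127-02,73-2171,MPN170,1707,S125,CM1,2017,9,11895,0
19-2472-01,73-2302,MPN24,1711,S119,CM1,2017,10,4479,0
20-0127-02,73-2169,MPN170,1706,S125,CM1,2017,9,7322,0
20-0127-02,73-2296,MPN170,1822,S125,CM1,2018,12,180193,0
15-14399-01,73-2590,MPN195,1739,S133,CM6,2018,11,1290,0
"""


def build_info(row):
    """Turn one CSV row into a leaf 'info' entry (hypothetical helper)."""
    return {
        "date_code": int(row["DATE_CODE"]),
        "supplier": {
            "id": row["Supplier Code"],
            "FYFM": f"{row['Fiscal YEAR']}-{row['Fiscal MONTH']}",
            "usage": int(row["Usage"]),
            "defects": int(row["Defects"]),
        },
    }


def build_nested(frame):
    """Group by PCA code, then part number, then MPN code (hypothetical helper)."""
    result = []
    for pca_code, pca_group in frame.groupby("PCA Code", sort=False):
        parts = []
        for part_number, part_group in pca_group.groupby("PN", sort=False):
            manufacturers = []
            for mpn_code, mpn_group in part_group.groupby("MPN Code", sort=False):
                # Every remaining row under this MPN becomes one 'info' entry.
                info = [build_info(row) for _, row in mpn_group.iterrows()]
                manufacturers.append({"id": mpn_code, "info": info})
            parts.append({"number": part_number, "manufacturer": manufacturers})
        result.append({
            "PCA": {
                "code": pca_code,
                # unique() keeps first-seen order and drops repeated CM codes.
                "CM": pca_group["CM Code"].unique().tolist(),
                "parts": parts,
            }
        })
    return result


dataframe = pd.read_csv(io.StringIO(CSV_TEXT))
nested = build_nested(dataframe)
print(json.dumps(nested, indent=4))
```

With the sample rows from the question, the entry for 73-2590 ends up with two parts (13-1668-01 and 15-14399-01) and the CM list ["CM1", "CM6"]. On a very large file the row-level iterrows call may be slow, so this is meant as a starting point for the structure rather than a tuned solution.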