Flatten nested JSON to CSV with nested column names

Time: 2019-04-23 08:57:45

Tags: python pandas

I have a rather unusual requirement. I have the JSON below and need to convert it to a flat CSV.

[
  {
    "authorizationQualifier": "SDA",
    "authorizationInformation": "          ",
    "securityQualifier": "ASD",
    "securityInformation": "          ",
    "senderQualifier": "ASDAD",
    "senderId": "FADA      ",
    "receiverQualifier": "ADSAS",
    "receiverId": "ADAD           ",
    "date": "140101",
    "time": "0730",
    "standardsId": null,
    "version": "00501",
    "interchangeControlNumber": "123456789",
    "acknowledgmentRequested": "0",
    "testIndicator": "T",
    "functionalGroups": [
      {
        "functionalIdentifierCode": "ADSAD",
        "applicationSenderCode": "ASDAD",
        "applicationReceiverCode": "ADSADS",
        "date": "20140101",
        "time": "07294900",
        "groupControlNumber": "123456789",
        "responsibleAgencyCode": "X",
        "version": "005010X221A1",
        "transactions": [
          {
            "name": "ASDADAD",
            "transactionSetIdentifierCode": "adADS",
            "transactionSetControlNumber": "123456789",
            "implementationConventionReference": null,
            "segments": [
              {
                "BPR03": "ad",
                "BPR14": "QWQWDQ",
                "BPR02": "1.57",
                "BPR13": "23223",
                "BPR01": "sad",
                "BPR12": "56",
                "BPR10": "32424",
                "BPR09": "12313",
                "BPR08": "DA",
                "BPR07": "123456789",
                "BPR06": "12313",
                "BPR05": "ASDADSAD",
                "BPR16": "21313",
                "BPR04": "SDADSAS",
                "BPR15": "11212",
                "id": "aDSASD"
              },
              {
                "TRN02": "2424",
                "TRN03": "35435345",
                "TRN01": "3435345",
                "id": "FSDF"
              },
              {
                "REF02": "fdsffs",
                "REF01": "sfsfs",
                "id": "fsfdsfd"
              },
              {
                "DTM02": "2432424",
                "id": "sfsfd",
                "DTM01": "234243"
              }
            ],
            "loops": [
              {
                "id": "24324234234",
                "segments": [
                  {
                    "N101": "sfsfsdf",
                    "N102": "sfsf",
                    "id": "dgfdgf"
                  },
                  {
                    "N301": "sfdssfdsfsf",
                    "N302": "effdssf",
                    "id": "fdssf"
                  },
                  {
                    "N401": "sdffssf",
                    "id": "sfds",
                    "N402": "sfdsf",
                    "N403": "23424"
                  },
                  {
                    "PER06": "Wsfsfdsfsf",
                    "PER05": "sfsf",
                    "PER04": "23424",
                    "PER03": "fdfbvcb",
                    "PER02": "Pedsdsf",
                    "PER01": "sfsfsf",
                    "id": "fdsdf"
                  }
                ]
              },
              {
                "id": "2342",
                "segments": [
                  {
                    "N101": "sdfsfds",
                    "N102": "vcbvcb",
                    "N103": "dsfsdfs",
                    "N104": "343443",
                    "id": "fdgfdg"
                  },
                  {
                    "N401": "dfsgdfg",
                    "id": "dfgdgdf",
                    "N402": "dgdgdg",
                    "N403": "234244"
                  },
                  {
                    "REF02": "23423342",
                    "REF01": "fsdfs",
                    "id": "sfdsfds"
                  }
                ]
              }
            ]
          }
        ]
      }
    ]
  }
]

The column headers for the more deeply nested key-value pairs should use a nested path form, e.g. functionalGroups[0].transactions[0].segments[0].BPR15
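To make the naming scheme concrete, here is a minimal sketch of a recursive flattener that builds keys of that form. The function name `flatten` and the `start_index` parameter are hypothetical (not from pandas or any library); `start_index` is there only because the example path above uses 0-based indices while the Java tool's sample output appears to be 1-based. The sample record is a trimmed-down stand-in for the JSON above.

```python
def flatten(obj, prefix="", start_index=0):
    """Return a flat dict mapping path-style keys to scalar values.

    Dict keys are joined with '.', list positions are written as '[i]'.
    """
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else key
            flat.update(flatten(value, path, start_index))
    elif isinstance(obj, list):
        for i, item in enumerate(obj, start=start_index):
            flat.update(flatten(item, f"{prefix}[{i}]", start_index))
    else:
        # Scalar (str, number, bool, None): store it under the accumulated path.
        flat[prefix] = obj
    return flat


# Trimmed-down stand-in for one record of the JSON above.
record = {
    "date": "140101",
    "functionalGroups": [
        {
            "version": "005010X221A1",
            "transactions": [
                {"segments": [{"BPR15": "11212", "id": "aDSASD"}]}
            ],
        }
    ],
}

row = flatten(record)
print(row["functionalGroups[0].transactions[0].segments[0].BPR15"])
```

Since the top level of the real file is a list of records, one would presumably call `flatten` once per element to get one CSV row per record. Note that empty lists and empty dicts produce no columns at all in this sketch.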

I can do this in Java in a single line using this GitHub project (where you can also see the output format I want):

flatJson = JSONFlattener.parseJson(new File("files/simple.json"), "UTF-8");

The output is:

date,securityQualifier,testIndicator,functionalGroups[1].functionalIdentifierCode,functionalGroups[1].date,functionalGroups[1].applicationReceiverCode, ...
140101,00,T,HP,20140101,ETIN,...
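A CSV of this shape can be produced with the standard library once each record has been reduced to a flat dict. The sketch below assumes the rows are already flattened (the two rows are made-up illustrations, not real output) and shows one way to handle records that do not share the same set of keys: take the union of all keys as the header and leave missing cells empty.

```python
import csv
import io

# Hypothetical, already-flattened rows with differing key sets.
rows = [
    {"date": "140101", "functionalGroups[0].date": "20140101"},
    {"date": "140102", "functionalGroups[0].version": "005010X221A1"},
]

# Union of all keys, kept in first-seen order (dict preserves insertion order).
fieldnames = list(dict.fromkeys(key for row in rows for key in row))

buffer = io.StringIO()
writer = csv.DictWriter(
    buffer,
    fieldnames=fieldnames,
    restval="",           # value written for keys a row does not have
    lineterminator="\n",
)
writer.writeheader()
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```

Writing to an `io.StringIO` keeps the example self-contained; for a real file, `open("out.csv", "w", newline="")` would take its place.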

But I want to do this in Python. I tried what is suggested in this answer:

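import json
from pandas import json_normalize  # pandas >= 1.0; older: from pandas.io.json import json_normalize

with open('data.json') as data_file:
    data = json.load(data_file)
df = json_normalize(data, record_prefix=True)

with open('temp2.csv', "w", newline='\n') as csv_file:
    csv_file.write(df.to_csv())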

However, for the functionalGroups column, it dumps the nested JSON into the cell as a single value.
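The behaviour can be reproduced on a tiny input. The sketch below assumes pandas 1.0 or later (where `json_normalize` is exposed as `pd.json_normalize`); the `sender` dict is a made-up field added only to contrast nested dicts, which get dotted column names, with lists, which are left in a single cell when no `record_path` is given.

```python
import pandas as pd

# Tiny stand-in record: one nested dict (hypothetical field) and one list of dicts.
records = [
    {
        "date": "140101",
        "sender": {"id": "FADA"},
        "functionalGroups": [{"version": "005010X221A1"}],
    }
]

df = pd.json_normalize(records)

# The nested dict is expanded into a dotted column name...
print("sender.id" in df.columns)

# ...but the list stays as one Python list object inside a single cell.
cell = df.loc[0, "functionalGroups"]
print(type(cell))
```

This is why the CSV ends up with a serialized list in the functionalGroups column: the list has to be expanded separately, either through `record_path` or by flattening the structure before building the DataFrame.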

I also tried what is suggested in this answer:

import json
import pandas

with open('data.json') as f:  # the context manager opens and closes the file
    a = json.load(f)

df = pandas.DataFrame(a)

print(df.transpose())

But this appears to do the same thing:

                                                                          0
acknowledgmentRequested                                                   0
authorizationInformation                                                   
authorizationQualifier                                                  SDA
date                                                                 140101
functionalGroups          [{'functionalIdentifierCode': 'ADSAD', 'applic...
interchangeControlNumber                                          123456789
receiverId                                                  ADAD           
receiverQualifier                                                     ADSAS
securityInformation                                                        
securityQualifier                                                       ASD
senderId                                                         FADA      
senderQualifier                                                       ASDAD
standardsId                                                            None
testIndicator                                                             T
time                                                                   0730
version                                                               00501

Is it possible to do what I want in Python?

0 answers:

No answers