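Pandas DataFrame to partially nested JSON

Time: 2018-11-07 22:25:21

Tags: python json pandas dataframe

I have a problem similar to this one. However, I need my JSON to be only partially nested. Currently, my DataFrame looks like this: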

import json

import pandas as pd

df = pd.DataFrame({'subsidary': ['company name','company name'],
                   'purchase_order_number': ['PO Num', 'PO Num'],
                   'invoice_date': ['2018-10-15', '2018-10-15'],
                   'vendor_invoice_number': ['777','777'],
                   'vendor_sku': ['SKU888', 'SKU888'],
                   'quantity': ['10', '20'],
                   'rate': ['12.00', '11.00'],
                   'amount': ['120.00', '220.00'],
                   'freight': ['5.00', '5.00'],
                   'taxes': ['0.00', '0.00']})

Using the link above and the code below:

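# Group on the invoice header columns and collect each group's line items as a list of dicts.
# to_dict('records') is the full spelling of the old 'r' shorthand.
j = (df.groupby(['subsidary','purchase_order_number','invoice_date','vendor_invoice_number'])
       .apply(lambda x: x[['vendor_sku','quantity','rate','amount']].to_dict('records'))
       .reset_index(name='item_charges')
       .to_json(orient='records'))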

print(json.dumps(json.loads(j), indent=2, sort_keys=False))

I was able to get it to look like this:

[
  {
    "subsidary": "company name",
    "purchase_order_number": "PO Num",
    "invoice_date": "2018-10-15",
    "vendor_invoice_number": "777",
    "item_charges": [
      {
        "vendor_sku": "SKU888",
        "quantity": "10",
        "rate": "12.00",
        "amount": "120.00"
      },
      {
        "vendor_sku": "SKU888",
        "quantity": "20",
        "rate": "11.00",
        "amount": "220.00"
      }
    ]
  }
]

However, I would like it to look like this:

[
  {
    "subsidary": "Natural Partners",
    "purchase_order_number": "AZ003387-PO",
    "invoice_date": "2018-10-15",
    "vendor_invoice_number": "76947",
    "item_charges": [
      {
        "vendor_sku": "SUP002",
        "quantity": "12.00",
        "rate": "14.50",
        "amount": "174.00"
      },
      {
        "vendor_sku": "SUP004",
        "quantity": "3.00",
        "rate": "8.75",
        "amount": "26.25"
      }
    ],
    "invoice_charges": {
      "freight": "5.00",
      "taxes": "0.00"
    }
  }
]

Is there a way to do this in Python?

Thanks.

1 Answer:

Answer 0 (score: 1)

You can do this by storing each nested level before you process the next one.

import json

import pandas as pd

df = pd.DataFrame({'subsidary': ['company name','company name'],
                   'purchase_order_number': ['PO Num', 'PO Num'],
                   'invoice_date': ['2018-10-15', '2018-10-15'],
                   'vendor_invoice_number': ['777','777'],
                   'vendor_sku': ['SKU888', 'SKU888'],
                   'quantity': ['10', '20'],
                   'rate': ['12.00', '11.00'],
                   'amount': ['120.00', '220.00'],
                   'freight': ['5.00', '5.00'],
                   'taxes': ['0.00', '0.00']})

group_cols = ['subsidary', 'purchase_order_number', 'invoice_date',
              'vendor_invoice_number', 'freight', 'taxes']

# Your original procedure, with freight and taxes added to the grouping keys
j = (df.groupby(group_cols)
       .apply(lambda x: x[['vendor_sku', 'quantity', 'rate', 'amount']].to_dict('records'))
       .reset_index(name='item_charges'))

# Store the item_charges and do it again.
# Note: this step reads the grouping columns inside apply, which newer pandas releases deprecate.
item_charges = j['item_charges']
j = (j.groupby(group_cols)
      .apply(lambda x: x[['freight', 'taxes']].to_dict('records'))
      .reset_index(name='invoice_charges'))

# Add back the stored item_charges (both frames share the same 0..n-1 index)
j['item_charges'] = item_charges
j = j.to_json(orient='records')
print(json.dumps(json.loads(j), indent=2, sort_keys=False))

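I should say that I'm not thrilled with this approach, and I can't imagine it's efficient, but it's what I could come up with. It runs, although invoice_charges comes out as a one-element list and freight and taxes also stay at the top level. The output is as follows: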

[
  {
    "subsidary": "company name",
    "purchase_order_number": "PO Num",
    "invoice_date": "2018-10-15",
    "vendor_invoice_number": "777",
    "freight": "5.00",
    "taxes": "0.00",
    "invoice_charges": [
      {
        "freight": "5.00",
        "taxes": "0.00"
      }
    ],
    "item_charges": [
      {
        "vendor_sku": "SKU888",
        "quantity": "10",
        "rate": "12.00",
        "amount": "120.00"
      },
      {
        "vendor_sku": "SKU888",
        "quantity": "20",
        "rate": "11.00",
        "amount": "220.00"
      }
    ]
  }
]
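
If the exact shape from the question is needed (invoice_charges as a single object, with freight and taxes not repeated at the top level), one possible alternative is to loop over the groups and build the records by hand. This is only a sketch, not the answer's method: the names HEADER_COLS, ITEM_COLS, INVOICE_COLS and nest_invoices are made up for illustration, and it assumes freight and taxes are the same on every row of an invoice and that all values are strings, as in the sample data.

```python
import json

import pandas as pd

# Same sample frame as in the question.
df = pd.DataFrame({'subsidary': ['company name', 'company name'],
                   'purchase_order_number': ['PO Num', 'PO Num'],
                   'invoice_date': ['2018-10-15', '2018-10-15'],
                   'vendor_invoice_number': ['777', '777'],
                   'vendor_sku': ['SKU888', 'SKU888'],
                   'quantity': ['10', '20'],
                   'rate': ['12.00', '11.00'],
                   'amount': ['120.00', '220.00'],
                   'freight': ['5.00', '5.00'],
                   'taxes': ['0.00', '0.00']})

# Hypothetical names; the column groupings themselves come from the question.
HEADER_COLS = ['subsidary', 'purchase_order_number', 'invoice_date',
               'vendor_invoice_number']
ITEM_COLS = ['vendor_sku', 'quantity', 'rate', 'amount']
INVOICE_COLS = ['freight', 'taxes']


def nest_invoices(frame):
    """Return one dict per invoice with item and invoice charges nested."""
    records = []
    # sort=False keeps invoices in order of first appearance.
    for keys, group in frame.groupby(HEADER_COLS, sort=False):
        record = dict(zip(HEADER_COLS, keys))
        # One dict per line item.
        record['item_charges'] = group[ITEM_COLS].to_dict('records')
        # Assumption: freight/taxes are constant within an invoice,
        # so the first row of the group is representative.
        record['invoice_charges'] = group[INVOICE_COLS].iloc[0].to_dict()
        records.append(record)
    return records


nested = nest_invoices(df)
print(json.dumps(nested, indent=2))
```

Because the records are ordinary dicts and lists, there is no round trip through to_json and json.loads. If the columns held numeric dtypes instead of strings, the values would probably need converting to plain Python types before json.dumps accepts them.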