Question

I'm new to Python programming and I'm trying to write a program that reads an xlsx file and converts it to JSON. (I'm using Python 3 and pandas 0.23.0, but I've run into some problems.)

==========================

My xlsx file has two rows with four columns each:

id     label        id_customer     label_customer

6     Sao Paulo      CUST-99992         Brazil

92    Hong Hong      CUST-88888         China

==========================

Here is my code:

import pandas as pd
import json

file_imported = pd.read_excel('testing.xlsx', sheet_name = 'Plan1')

list1 = []
list  = []
for index, row in file_imported.iterrows():
    list.append ({
            "id"       : int(row['id']),
            "label"    : str(row['label']),
            "Customer" : list1
            })

    list1.append ({
           "id"       : str(row['id_customer']) ,
           "label"    : str(row['label_customer'])
           })

print (list)

with open ('testing.json', 'w') as f:
    json.dump(list, f, indent= True)

======================

JSON output:

[
 {
  "id": 6,
  "label": "Sao Paulo",
  "Customer": [
   {
    "id": "CUST-99992",
    "label": "Brazil"
   },
   {
    "id": "CUST-88888",
    "label": "China"
   }
  ]
 },
 {
  "id": 92,
  "label": "Hong Hong",
  "Customer": [
   {
    "id": "CUST-99992",
    "label": "Brazil"
   },
   {
    "id": "CUST-88888",
    "label": "China"
   }
  ]
 }
]

======================

Expected result:

[
 {
  "id": 6,
  "label": "Sao Paulo",
  "Customer": [
   {
    "id": "CUST-99992",
    "label": "Brazil"
   }
  ]
 },
 {
  "id": 92,
  "label": "Hong Hong",
  "Customer": [
   {
    "id": "CUST-88888",
    "label": "China"
   }
  ]
 }
]

======================

I've already tried appending to list1 before adding it to the list, but that didn't solve the problem.

Can anyone help me?

Answer 1

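Your code almost works. First, avoid using built-in names such as "list" for your variables, since that can cause problems later. Second, you have the order backwards: the customer entry has to be built before it is attached to the row's record. I changed a few lines and ended up with this: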

import json
import pandas as pd

file_imported = pd.read_excel('testing.xlsx')
print(file_imported)
list1 = []
for index, row in file_imported.iterrows():
    # Build the nested "Customer" list inline, so each row gets its own list
    list1.append({
            "id"       : int(row['id']),
            "label"    : str(row['label']),
            "Customer" : [{'id':str(row['id_customer']),'label':str(row['label_customer'])}]
            })

print(list1)

with open('testing.json', 'w') as f:
    json.dump(list1, f, indent=True)

This seems to do what you want. By the way, you could also do this with df.apply if you prefer.
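As a rough sketch of the df.apply variant mentioned above: the DataFrame below is built inline from the sample rows in the question so the snippet runs without the spreadsheet, and the helper name row_to_record is made up for illustration. With a real file you would presumably replace the inline DataFrame with the pd.read_excel call.

```python
import json
import pandas as pd

# Sample rows copied from the question; this stands in for
# pd.read_excel('testing.xlsx') so the sketch is self-contained.
df = pd.DataFrame({
    "id": [6, 92],
    "label": ["Sao Paulo", "Hong Hong"],
    "id_customer": ["CUST-99992", "CUST-88888"],
    "label_customer": ["Brazil", "China"],
})

def row_to_record(row):
    """Build one nested record from a single row (hypothetical helper)."""
    return {
        "id": int(row["id"]),
        "label": str(row["label"]),
        "Customer": [
            {"id": str(row["id_customer"]), "label": str(row["label_customer"])}
        ],
    }

# axis=1 calls the function once per row; tolist() turns the
# resulting Series of dicts into a plain list that json can serialize.
records = df.apply(row_to_record, axis=1).tolist()

print(json.dumps(records, indent=1))
```

Because each call to row_to_record creates its own "Customer" list, the rows cannot end up sharing customer entries.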

Answer 2

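The problem is that list1 is never reset inside the loop. Every record's "Customer" refers to that same list, which keeps growing with each iteration. Creating a new, empty list1 at the start of each iteration gives the desired output. See the code below (I renamed the variable list to list_final, because list is a type and you shouldn't have a variable with the same name as a type):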

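import pandas as pd

# Same file and sheet as in the question
df = pd.read_excel('testing.xlsx', sheet_name='Plan1')

list_final = []
for index, row in df.iterrows():
    list1  = []  # new, empty list for this row only
    list_final.append ({
            "id"       : int(row['id']),
            "label"    : str(row['label']),
            "Customer" : list1
            })

    list1.append ({
           "id"       : str(row['id_customer']) ,
           "label"    : str(row['label_customer'])
           })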

Now the output is what you expect:

print(list_final)
[
{'id': 6, 'label': 'Sao Paulo', 'Customer': [{'id': 'CUST-99992', 'label': 'Brazil'}]}, 
{'id': 92, 'label': 'Hong Kong', 'Customer': [{'id': 'CUST-88888', 'label': 'China'}]}
]
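To illustrate why the original loop repeated every customer under every row, here is a minimal sketch with no pandas involved. The variable names (shared, fresh, records_shared, records_fresh) are invented for this example. It compares storing one list object in every record with creating a new list on each iteration.

```python
# Case 1: one list object is stored in every record (as in the question).
shared = []
records_shared = []
for customer in ["Brazil", "China"]:
    records_shared.append({"Customer": shared})
    shared.append(customer)

# Case 2: a new list object is created on every iteration.
records_fresh = []
for customer in ["Brazil", "China"]:
    fresh = []
    records_fresh.append({"Customer": fresh})
    fresh.append(customer)

print(records_shared)
# [{'Customer': ['Brazil', 'China']}, {'Customer': ['Brazil', 'China']}]
print(records_fresh)
# [{'Customer': ['Brazil']}, {'Customer': ['China']}]

# Both records in case 1 point at the very same list object.
print(records_shared[0]["Customer"] is records_shared[1]["Customer"])
# True
```

Note that calling shared.clear() inside the loop would not help, because it empties the one list that all records still refer to. Rebinding the name to a new list is what separates the records.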
