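Basically, I'm using pandas to read an xlsx file and convert it to a JSON file. I know how to do that part, but I think I need an 'if' statement that reads each row, finds the elements that differ from the previous row, and appends them to my object.
The data I'm reading: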
id  label      id_customer  label_customer  part_number
6   Sao Paulo  CUST-99992   Brazil          7897
6   Sao Paulo  CUST-99992   Brazil          982
6   Sao Paulo  CUST-43535   Brazil          435
92  Hong Hong  CUST-88888   China           785
===============================
Here is my code:
import pandas as pd
import json

file_imported = pd.read_excel('testing.xlsx', sheet_name='Plan1')

list_final = []
for index, row in file_imported.iterrows():
    list1 = []
    list_final.append({
        "id": int(row['id']),
        "label": str(row['label']),
        "Customer": list1
    })
    list2 = []
    list1.append({
        "id": str(row['id_customer']),
        "label": str(row['label_customer']),
        "number": list2
    })
    list2.append({
        "part": str(row['part_number'])
    })

print(list_final)
with open('testing.json', 'w') as f:
    json.dump(list_final, f, indent=True)
===============================
JSON output:
[
 {
  "id": 6,
  "label": "Sao Paulo",
  "Customer": [
   {
    "id": "CUST-99992",
    "label": "Brazil",
    "number": [
     {
      "part": "7897"
     }
    ]
   }
  ]
 },
 {
  "id": 6,
  "label": "Sao Paulo",
  "Customer": [
   {
    "id": "CUST-99992",
    "label": "Brazil",
    "number": [
     {
      "part": "982"
     }
    ]
   }
  ]
 },
 {
  "id": 6,
  "label": "Sao Paulo",
  "Customer": [
   {
    "id": "CUST-43535",
    "label": "Brazil",
    "number": [
     {
      "part": "435"
     }
    ]
   }
  ]
 },
 {
  "id": 92,
  "label": "Hong Hong",
  "Customer": [
   {
    "id": "CUST-88888",
    "label": "China",
    "number": [
     {
      "part": "785"
     }
    ]
   }
  ]
 }
]
===============================
But I need something like this:
[
 {
  "id": 6,
  "label": "Sao Paulo",
  "Customer": [
   {
    "id": "CUST-99992",
    "label": "Brazil",
    "number": [
     {
      "part": "7897"
     },
     {
      "part": "982"
     }
    ]
   },
   {
    "id": "CUST-43535",
    "label": "Brazil",
    "number": [
     {
      "part": "435"
     }
    ]
   }
  ]
 },
 {
  "id": 92,
  "label": "Hong Hong",
  "Customer": [
   {
    "id": "CUST-88888",
    "label": "China",
    "number": [
     {
      "part": "785"
     }
    ]
   }
  ]
 }
]
====================
Can anyone help me?
Answer 0 (score: 1)
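Looking at the desired JSON, the data needs to be grouped at two levels. The outer group is keyed on the id and label fields, and the inner group on the id_customer and label_customer fields. The innermost data is part_number, which can be built with a list comprehension that creates one dictionary per part: [{'part': str(p)} for p in df2['part_number']]. The rest is just data handling.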
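import json
import pandas as pd

# Same workbook and sheet as in the question
df = pd.read_excel('testing.xlsx', sheet_name='Plan1')

result = []
# Outer level: one record per (id, label) pair.
# groupby sorts the group keys by default; pass sort=False to keep first-appearance order.
for (id_, label), df1 in df.groupby(['id', 'label']):
    record = {'id': int(id_), 'label': label, 'Customer': []}
    # Inner level: one customer per (id_customer, label_customer) pair
    for (cust_id, cust_label), df2 in df1.groupby(['id_customer', 'label_customer']):
        record['Customer'].append({
            'id': cust_id,
            'label': cust_label,
            # Innermost level: one dict per part number in this group
            'number': [{'part': str(p)} for p in df2['part_number']]
        })
    result.append(record)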
>>> print(json.dumps(result, indent=True))
[
 {
  "id": 6,
  "label": "Sao Paulo",
  "Customer": [
   {
    "id": "CUST-43535",
    "label": "Brazil",
    "number": [
     {
      "part": "435"
     }
    ]
   },
   {
    "id": "CUST-99992",
    "label": "Brazil",
    "number": [
     {
      "part": "7897"
     },
     {
      "part": "982"
     }
    ]
   }
  ]
 },
 {
  "id": 92,
  "label": "Hong Hong",
  "Customer": [
   {
    "id": "CUST-88888",
    "label": "China",
    "number": [
     {
      "part": "785"
     }
    ]
   }
  ]
 }
]
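As an alternative, the row-by-row idea from the question can also be sketched in plain Python, using dictionaries to remember which groups have already been seen instead of comparing each row with the previous one. This is only a minimal sketch under some assumptions: the helper name nest_rows and the inline sample rows are made up for illustration, and with pandas the rows could presumably be obtained with file_imported.to_dict('records'). Because records are created the first time a key appears, customers stay in their original order, as in the desired output.

```python
import json

def nest_rows(rows):
    """Hypothetical helper: nest flat rows as (id, label) -> customers -> parts."""
    result = []
    records = {}    # (id, label) -> record dict already placed in result
    customers = {}  # (id, label, id_customer, label_customer) -> customer dict
    for row in rows:
        outer_key = (row['id'], row['label'])
        if outer_key not in records:
            # First time this id/label pair is seen: start a new record
            records[outer_key] = {
                'id': int(row['id']),
                'label': str(row['label']),
                'Customer': [],
            }
            result.append(records[outer_key])
        inner_key = outer_key + (row['id_customer'], row['label_customer'])
        if inner_key not in customers:
            # First time this customer is seen within the record
            customers[inner_key] = {
                'id': str(row['id_customer']),
                'label': str(row['label_customer']),
                'number': [],
            }
            records[outer_key]['Customer'].append(customers[inner_key])
        # Every row contributes exactly one part to its customer
        customers[inner_key]['number'].append({'part': str(row['part_number'])})
    return result

# Sample rows copied from the question's table (assumed shape of each row)
rows = [
    {'id': 6, 'label': 'Sao Paulo', 'id_customer': 'CUST-99992',
     'label_customer': 'Brazil', 'part_number': 7897},
    {'id': 6, 'label': 'Sao Paulo', 'id_customer': 'CUST-99992',
     'label_customer': 'Brazil', 'part_number': 982},
    {'id': 6, 'label': 'Sao Paulo', 'id_customer': 'CUST-43535',
     'label_customer': 'Brazil', 'part_number': 435},
    {'id': 92, 'label': 'Hong Hong', 'id_customer': 'CUST-88888',
     'label_customer': 'China', 'part_number': 785},
]

nested = nest_rows(rows)
print(json.dumps(nested, indent=1))
```

The dictionary lookups also handle rows that are not sorted by id, which a comparison with only the previous row would not.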