I have an Excel file like this:
name gender fac1(radio) fac2(tv) fac3(cycle) fac4(bike) hasCard cardNo
a1 f y y n y n
a2 m n n y n y AHJS5684
How can I get a structure like the following from the xls file above?
"name": "a1",
"gender": "f",
"facilities": ["radio", "tv", "bike"],
"card": {
"exists": "n",
"cardNo": ""
}
So far, I have read the Excel file in my code:
import pandas as pd
#reading excel
df = pd.read_excel("C:\\Users\\Desktop\\Culture\\Artist_Data\\EZCC\\Madur.xlsx")
new_df = df.assign(facilities = df.filter(like = 'fac').apply(lambda x: x.str.lower().dropna().tolist(), axis=1))
d = df.to_dict('records')
The code above does not give the expected result at all.
Answer 0: (score: 1)
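Pandas is well suited to processing dataframes, not to producing JSON. However, apply
can convert the rows (or columns) of a dataframe into anything, including dictionaries, and list
can turn a pandas Series into a plain Python list.
This means the required conversion can be written as: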
labels = {'fac1(radio)': 'radio', 'fac2(tv)': 'tv', 'fac3(cycle)': 'cycle',
'fac4(bike)': 'bike' }
d = list(df.fillna('').apply(lambda x: {
"name": x['name'],
"gender": x['gender'],
"facilities": [labels[i] for i in labels.keys() if x[i] == 'y'],
"card": {
"exists": x['hasCard'],
"cardNo": x['cardNo']
}}, axis=1))
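For anyone who wants to try this without the Excel file, here is a minimal sketch of the same apply-based conversion. It assumes an in-memory DataFrame holding the two sample rows from the question in place of pd.read_excel, and the helper name row_to_record is hypothetical, introduced only for readability.

```python
import pandas as pd

# Stand-in for pd.read_excel(...): the two sample rows from the question.
# The empty cardNo cell of a1 is modelled as a missing value (None).
df = pd.DataFrame({
    "name": ["a1", "a2"],
    "gender": ["f", "m"],
    "fac1(radio)": ["y", "n"],
    "fac2(tv)": ["y", "n"],
    "fac3(cycle)": ["n", "y"],
    "fac4(bike)": ["y", "n"],
    "hasCard": ["n", "y"],
    "cardNo": [None, "AHJS5684"],
})

# Map each facility column to the label that should appear in the output.
labels = {
    "fac1(radio)": "radio",
    "fac2(tv)": "tv",
    "fac3(cycle)": "cycle",
    "fac4(bike)": "bike",
}


def row_to_record(row):
    """Hypothetical helper: turn one DataFrame row into the nested dict."""
    return {
        "name": row["name"],
        "gender": row["gender"],
        # Keep only the facilities whose cell is 'y'.
        "facilities": [label for col, label in labels.items() if row[col] == "y"],
        "card": {"exists": row["hasCard"], "cardNo": row["cardNo"]},
    }


# fillna('') turns the missing cardNo into an empty string before conversion.
records = df.fillna("").apply(row_to_record, axis=1).tolist()
```

Naming the row function keeps the apply call short, and .tolist() does the same job as wrapping the resulting Series in list().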
You can check the result with
import json
print(json.dumps(d, indent=2))
which gives the expected output:
[
{
"name": "a1",
"gender": "f",
"facilities": [
"radio",
"tv",
"bike"
],
"card": {
"exists": "n",
"cardNo": ""
}
},
{
"name": "a2",
"gender": "m",
"facilities": [
"cycle"
],
"card": {
"exists": "y",
"cardNo": "AHJS5684"
}
}
]
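If the sheet has many facility columns, the hard-coded mapping could probably be derived from the headers instead. The sketch below assumes every facility column is named like fac<N>(<label>), as in the sample; that naming pattern, the regular expression, and the in-memory DataFrame standing in for the Excel file are assumptions and not part of the original answer.

```python
import re

import pandas as pd

# Stand-in for pd.read_excel(...): the two sample rows from the question.
df = pd.DataFrame({
    "name": ["a1", "a2"],
    "gender": ["f", "m"],
    "fac1(radio)": ["y", "n"],
    "fac2(tv)": ["y", "n"],
    "fac3(cycle)": ["n", "y"],
    "fac4(bike)": ["y", "n"],
    "hasCard": ["n", "y"],
    "cardNo": [None, "AHJS5684"],
})

# Assumption: facility headers look like "fac1(radio)"; the label is the
# text between the parentheses.
pattern = re.compile(r"^fac\d+\((.+)\)$")
labels = {}
for col in df.columns:
    match = pattern.match(col)
    if match:
        labels[col] = match.group(1)

# Build the nested records from plain dicts, one per row.
records = [
    {
        "name": row["name"],
        "gender": row["gender"],
        "facilities": [label for col, label in labels.items() if row[col] == "y"],
        "card": {"exists": row["hasCard"], "cardNo": row["cardNo"]},
    }
    for row in df.fillna("").to_dict("records")
]
```

This variant iterates over to_dict("records") rather than using apply; both produce the same list of dictionaries for the sample data.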