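I'm trying to work out the best way to convert a table into JSON records. I currently get the output I need, but the format of the table is confusing me. The example below should explain: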
ID    Product      Item_Material  Owner         Interest %
123   Test Item 1  Electric       Elctrotech    60%
null  null         null           Spark inc     40%
124   Test Item 2  Wood           TY Toys       100%
125   Test Item 3  Plastic        NA Materials  100%
The newline-delimited JSON I get is what I want, but I'd like table rows that belong to a parent row to be nested inside that parent's JSON record.
{"ID":"Test Item 1", "Item_Material":"Electric", "Owner":"Elctrotech","Interest %":"60%"}
{"ID":null, "Item_Material":null, "Owner":"Spark inc","Interest %":"40%"}
{"ID":"Test Item 2", "Item_Material":"Wood", "Owner":"TY Toys","Interest %":"100%"}
{"ID":"Test Item 3","Item_Material":"Plastic","Owner":"NA Materials","Interest %":"100%"}
The goal is for the first JSON record to look like this:
{"ID":"Test Item 1", "Item_Material":"Electric", "Owners": [{"Owner": "Elctrotech", "Interest %":"60%", "Owner":"Spark inc","Interest %":"40%"}]}
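The data comes from scraping a table with Beautiful Soup. Each row of the table shown above sits in its own <tr> tag, so that is how the rows appear once they are pulled into a pandas DataFrame. I don't know whether pandas has a feature for merging a row into the row above it, so that I end up with one JSON record per 'Product'. Sometimes an item has more than 2 'Owners'.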
Answer 0 (score: 0)
The output dicts are not exactly what you expected, but your dict syntax was wrong. Try this. It uses pandas only.
import pandas as pd

# Sample rows; repeated (Product, Item_Material) pairs stand in for extra owners.
p = [[123, "Test Item 1", "Electric", "Elctrotech", "60%"],
     [124, "Test Item 2", "Wood", "TY Toys", "100%"],
     [125, "Test Item 1", "Plastic", "NA Materials", "100%"],
     [123, "Test Item 1", "Foo", "Bar", "80%"],
     [123, "Test Item 1", "Electric", "TRY TRY TRY", "70%"]]
x = pd.DataFrame(p, columns=["ID", "Product", "Item_Material", "Owner", "Interest %"])
x_gb = x.groupby(["Product", "Item_Material"])
# Collect the owners and interests of each (Product, Item_Material) group into lists.
grouped_Series_Owner = x_gb["Owner"].apply(list).to_dict()
grouped_Series_Interest = x_gb["Interest %"].apply(list).to_dict()
# `out` was not defined in the original post; assumed here to be the rows keyed by index.
out = x.to_dict("index")
for k in out.keys():
    key = (out[k]["Product"], out[k]["Item_Material"])
    # Build a fresh dict per row; "ID" holds the product name, as in the question's JSON.
    d = {"ID": out[k]["Product"], "Item_Material": out[k]["Item_Material"],
         "Owners": {"Owner": grouped_Series_Owner[key], "Interest %": grouped_Series_Interest[key]}}
    print(d)
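For the table exactly as posted in the question, where continuation rows have empty parent columns, one possible approach is to forward-fill the parent columns and then group. The sketch below is only an illustration under a few assumptions: the empty cells arrive as real missing values (None/NaN) rather than the string "null", the variable names `rows`, `parent_cols` and `records` are made up for this example, and each owner is emitted as its own object because a single JSON object cannot usefully repeat the "Owner" and "Interest %" keys.

```python
import json
import pandas as pd

# Rows as they might come out of the scraped table: continuation rows carry
# None in the parent columns (ID, Product, Item_Material). Assumed sample data.
rows = [
    [123, "Test Item 1", "Electric", "Elctrotech", "60%"],
    [None, None, None, "Spark inc", "40%"],
    [124, "Test Item 2", "Wood", "TY Toys", "100%"],
    [125, "Test Item 3", "Plastic", "NA Materials", "100%"],
]
df = pd.DataFrame(rows, columns=["ID", "Product", "Item_Material", "Owner", "Interest %"])

parent_cols = ["ID", "Product", "Item_Material"]
# Copy the parent values down into the continuation rows.
df[parent_cols] = df[parent_cols].ffill()

records = []
# sort=False keeps the groups in the order they appear in the table.
for (item_id, product, material), group in df.groupby(parent_cols, sort=False):
    records.append({
        "ID": int(item_id),  # the missing values turned the column into floats
        "Product": product,
        "Item_Material": material,
        # One {"Owner": ..., "Interest %": ...} object per owner row.
        "Owners": group[["Owner", "Interest %"]].to_dict("records"),
    })

# Emit one JSON record per product, one per line.
for record in records:
    print(json.dumps(record))
```

If the scraper produces the literal string "null" instead of a missing value, replacing it first (for example with `df.replace("null", None)`) would be needed before the forward fill. Because the grouping does not care how many rows share a parent, items with more than two owners are handled the same way.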