我有以下 df 样本:
{'id_user': {0: -8884522802746938515,
1: -8884522802746938515,
2: -8884522802746938515},
'time': {0: '2021-01-01 11:10:34',
1: '2021-01-01 11:11:48',
2: '2021-01-01 11:12:38'},
'data': {0: '{"fat": 4, "type": "FOOD_GENERAL", "unit": "1 mug (8 fl oz)", "title": "Cappuccino", "amount": 1.0, "protein": 4, "calories": 74, "foodType": 4, "recipeId": 7350, "servings": 1.0, "timestamp": "1609499434205", "ingredient": true, "carbohydrates": 6, "nutrientsData": {"iron": 0.19, "fiber": 0.2, "sugar": 6.41, "sodium": 50.0, "calcium": 144.0, "protein": 4.08, "fatTotal": 3.98, "vitaminA": 34.0, "potassium": 233.0, "cholesterol": 12.0, "fatSaturated": 2.273, "carbohydrates": 5.81, "energyConsumed": 74.0, "fatMonounsaturated": 1.007, "fatPolyunsaturated": 0.241}}',
1: '{"fat": 1, "type": "FOOD_BRANDED", "unit": "1/2 cup prepared", "title": "Stove Top Stuffing Mix For Turkey (Kraft)", "amount": 1.0, "protein": 3, "calories": 110, "foodType": 5, "recipeId": 4072396, "servings": 1.0, "mealIndex": 2, "timestamp": "1609499508328", "ingredient": true, "carbohydrates": 21, "nutrientsData": {"iron": 1.3, "fiber": 1.0, "sugar": 2.0, "sodium": 370.0, "protein": 3.0, "fatTotal": 1.0, "potassium": 100.0, "carbohydrates": 21.0, "energyConsumed": 110.0}}',
2: '{"fat": 1, "type": "FOOD_BRANDED", "unit": "1/2 cup prepared", "title": "Stove Top Stuffing Mix For Turkey (Kraft)", "amount": 1.0, "protein": 3, "calories": 110, "foodType": 5, "recipeId": 4072396, "servings": 1.0, "timestamp": "1609499558606", "ingredient": true, "carbohydrates": 21, "nutrientsData": {"iron": 1.3, "fiber": 1.0, "sugar": 2.0, "sodium": 370.0, "protein": 3.0, "fatTotal": 1.0, "potassium": 100.0, "carbohydrates": 21.0, "energyConsumed": 110.0}}'}}
我正在对数据列执行以下操作:
pd.json_normalize(df.data.apply(json_loads))
结果我得到了我需要的东西,但我希望它粘在原来的 df 上。 我应该合并索引上的数据帧吗?是否有另一种方法可以在一行或一次完成?
答案 0 :(得分:1)
data
中的 df
列应先从 json 转换为 dict。
然后使用:
pd.json_normalize
df['data']
转换为数据帧,并合并到原点 df。df['data'] = df['data'].map(json.loads)
# method1
dfn = pd.json_normalize(df.to_dict(orient='records'))
# method2
obj = df['data']
dfn = df.merge(pd.DataFrame(obj.tolist(), index = obj.index),
left_index=True,
right_index=True)