I have a pandas DataFrame that looks like this:
df = pd.DataFrame({'x':['''[{"key":"Gender","value":["Men"]},
{"key":"Shoe Size","value":["M"]},
{"key":"Shoe Category","value":["Men's Shoes"]},
{"key":"Color","value":["Multicolor"]},
{"key":"Manufacturer Part Number","value":["8190-W-NAVY-7.5"]},
{"key":"Brand","value":["Josmo"]}]''',
'''[{"key":"Gender","value":["Women"]},
{"key":"Size","value":["XL"]},
{"key":"Heel Height","value":["1 Inches"]}]'''],
'y':['A','B']})
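Basically, column x holds a list of key-value pairs that I would like to extract into their own columns, and the keys are not consistent from row to row.
Any suggestions?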
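Answer 0 (score: 1)
Here is one possible solution. However, you have to know all of the possible keys in advance. I suppose that could be done programmatically, but I hard-coded them here. Also, if a value contains more than one item, only the first one is taken.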
import pandas as pd
import json
# original dataframe
df = pd.DataFrame({'x':['''[{"key":"Gender","value":["Men"]},
{"key":"Shoe Size","value":["M"]},
{"key":"Shoe Category","value":["Men's Shoes"]},
{"key":"Color","value":["Multicolor"]},
{"key":"Manufacturer Part Number","value":["8190-W-NAVY-7.5"]},
{"key":"Brand","value":["Josmo"]}]''',
'''[{"key":"Gender","value":["Women"]},
{"key":"Shoe Size","value":["M"]},
{"key":"Shoe Category","value":["Women's Shoes"]},
{"key":"Color","value":["Multicolor"]},
{"key":"Manufacturer Part Number","value":["8190-W-NAVY-7.5"]}]'''],
'y':['A','B']})
expanded_columns = ['Gender', 'Shoe Size', 'Shoe Category', 'Color',
'Manufacturer Part Number', 'Brand']
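# Parse one JSON string and return its values ordered like expanded_columns
def json_to_cols(s):
    l = json.loads(s)
    d = {i: None for i in expanded_columns}
    for row in l:
        # keep only the first item of each value list; skip keys not listed
        # above so the result always matches expanded_columns in length
        if row['key'] in d:
            d[row['key']] = row['value'][0]
    return list(d.values())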
# Create new dataframe with expanded columns
df1 = df.apply(lambda row: pd.Series(json_to_cols(row['x']), index=expanded_columns),
axis=1)
new_df = df.join(df1)
print(new_df)
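The hard-coded expanded_columns list could instead be collected from the data itself. The following is only a minimal sketch, assuming every cell in x is valid JSON with the same key/value layout shown in the question; collect_keys is a hypothetical helper name, not part of pandas, and the two-row frame is a shortened stand-in for the original data.

```python
import json
import pandas as pd

# Shortened stand-in for the question's data (assumption: same layout).
df = pd.DataFrame({
    'x': ['[{"key":"Gender","value":["Men"]},{"key":"Brand","value":["Josmo"]}]',
          '[{"key":"Gender","value":["Women"]},{"key":"Size","value":["XL"]}]'],
    'y': ['A', 'B'],
})

def collect_keys(series):
    """Return every distinct key in first-seen order (hypothetical helper)."""
    seen = {}
    for cell in series:
        for pair in json.loads(cell):
            # A dict keeps insertion order, so it doubles as an ordered set.
            seen.setdefault(pair['key'], None)
    return list(seen)

expanded_columns = collect_keys(df['x'])
print(expanded_columns)
```

The resulting list can then be used in place of the hard-coded one, at the cost of parsing each cell twice.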
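Answer 1 (score: 0)
It is not entirely clear what you want, but the following code produces a DataFrame whose column names are taken from y, whose index is taken from the keys in x, and whose values are taken from the values in x, with NaN for any key that does not appear in a given row.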
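import ast
import pandas as pd

# df is the DataFrame from the question
output_df = pd.DataFrame(
    {
        # outer key: the row's y label; inner dict: key -> first value
        input_row[1]['y']: {
            pair['key']: pair['value'][0]
            for pair in ast.literal_eval(input_row[1]['x'])
        }
        for input_row in df.iterrows()
    }
)
print(output_df)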
Output:
                                        A         B
Brand                               Josmo       NaN
Color                          Multicolor       NaN
Gender                                Men     Women
Heel Height                           NaN  1 Inches
Manufacturer Part Number  8190-W-NAVY-7.5       NaN
Shoe Category                 Men's Shoes       NaN
Shoe Size                               M       NaN
Size                                  NaN        XL
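The result above is indexed by key, with one column per y label. If the goal is instead one column per key alongside the original rows, as the question seems to ask, a variant along these lines might work. This is a sketch only, assuming each cell of x is valid JSON and every value list has at least one item; the names records and wide are illustrative, and the two-row frame is a shortened stand-in for the original data.

```python
import json
import pandas as pd

# Shortened stand-in for the question's data (assumption: same layout).
df = pd.DataFrame({
    'x': ['[{"key":"Gender","value":["Men"]},{"key":"Brand","value":["Josmo"]}]',
          '[{"key":"Gender","value":["Women"]},{"key":"Size","value":["XL"]}]'],
    'y': ['A', 'B'],
})

# One dict per row; the DataFrame constructor aligns the keys into
# columns and fills keys missing from a row with NaN.
records = [
    {pair['key']: pair['value'][0] for pair in json.loads(cell)}
    for cell in df['x']
]

# Reuse the original index so the join lines up row by row.
wide = df[['y']].join(pd.DataFrame(records, index=df.index))
print(wide)
```

No key list has to be known in advance here, because the columns are the union of the keys found in the data.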