I tried to find a solution but could not get one. I have the following output from an API in Python.
insights = [ <Insights> {
"account_id": "1234",
"actions": [
{
"action_type": "add_to_cart",
"value": "8"
},
{
"action_type": "purchase",
"value": "2"
}
],
"cust_id": "xyz123",
"cust_name": "xyz",
}, <Insights> {
"account_id": "1234",
"cust_id": "pqr123",
"cust_name": "pqr",
}, <Insights> {
"account_id": "1234",
"actions": [
{
"action_type": "purchase",
"value": "45"
}
],
"cust_id": "abc123",
"cust_name": "abc",
}
]
I want the dataframe to look like this:
- account_id  add_to_cart  purchase  cust_id  cust_name
- 1234        8            2         xyz123   xyz
- 1234                               pqr123   pqr
- 1234                     45        abc123   abc
When I use the following:
insights_1 = [x for x in insights]
df = pd.DataFrame(insights_1)
I get the following:
- account_id actions cust_id cust_name
- 1234 [{'value': '8', 'action_type': 'add_to_cart'},{'value': '2', 'action_type': 'purchase'}] xyz123 xyz
- 1234 NaN pqr123 pqr
- 1234 [{'value': '45', 'action_type': 'purchase'}] abc123 abc
How do I proceed from here?
Answer 0 (score: 4)
Here is one solution.
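import numpy as np
import pandas as pd

df = pd.DataFrame(insights)
# One single-row frame per entry; x == x is False only for NaN (rows without actions).
parts = [pd.DataFrame({d['action_type']: d['value'] for d in x}, index=[0])
         if x == x else pd.DataFrame({'add_to_cart': [np.nan], 'purchase': [np.nan]})
         for x in df['actions']]
# Stack the parts row-wise and join them onto the remaining columns.
df = df.drop('actions', axis=1)\
       .join(pd.concat(parts, axis=0, ignore_index=True))
print(df)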
account_id cust_id cust_name add_to_cart purchase
0 1234 xyz123 xyz 8 2
1 1234 pqr123 pqr NaN NaN
2 1234 abc123 abc NaN 45
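To try this kind of reshaping without access to the API, the following is a minimal sketch that uses plain dicts as stand-ins for the Insights objects (an assumption; the real objects are only assumed to behave like dicts). It is a variation on the code above rather than the same code: it builds one flat dict per row instead of one mini-dataframe per row, and the helper name actions_to_row is made up for illustration.

```python
import pandas as pd

# Plain dicts standing in for the API's Insights objects (assumption).
insights = [
    {"account_id": "1234",
     "actions": [{"action_type": "add_to_cart", "value": "8"},
                 {"action_type": "purchase", "value": "2"}],
     "cust_id": "xyz123", "cust_name": "xyz"},
    {"account_id": "1234", "cust_id": "pqr123", "cust_name": "pqr"},
    {"account_id": "1234",
     "actions": [{"action_type": "purchase", "value": "45"}],
     "cust_id": "abc123", "cust_name": "abc"},
]

df = pd.DataFrame(insights)


def actions_to_row(actions):
    """Turn one 'actions' cell into a flat {action_type: value} dict (hypothetical helper)."""
    if not isinstance(actions, list):  # missing cell (NaN): contribute no columns
        return {}
    return {d["action_type"]: d["value"] for d in actions}


# One dict per row; pandas fills the gaps with NaN when building the frame.
wide = pd.DataFrame([actions_to_row(a) for a in df["actions"]], index=df.index)
result = df.drop(columns="actions").join(wide)
print(result)
```

Because the columns come from whatever action types appear in the data, this variant does not need 'add_to_cart' and 'purchase' to be hard-coded.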
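Explanation
- Use pandas to read the outer list of dictionaries into a dataframe.
- Account for nan values, which appear in rows that have no actions.

Explanation - detail
This details how parts is built:
- Take each entry in df['actions']; each entry is a list of dictionaries.
- Iterate over them one by one, i.e. row by row, in the for loop.
- The else part says "if it is np.nan [i.e. the entry is missing], then return a dataframe of nans".
- The if part takes the list of dictionaries and creates a mini-dataframe for each row.

Answer 1 (score: 1)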
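I think using apply on df would be an option. First, I replace NaN with an empty list:
# Keep real lists; replace missing cells with a new empty list (avoids chained assignment).
df['actions'] = df['actions'].apply(lambda x: x if isinstance(x, list) else [])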
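This step matters because the functions that follow loop over each cell, and looping over a float NaN raises a TypeError. A small sketch to check the replacement in isolation, using a made-up Series in place of df['actions'] and an isinstance check to decide which cells to replace:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df['actions']: one list of dicts and one missing cell.
actions = pd.Series([
    [{"action_type": "purchase", "value": "2"}],
    np.nan,
])

# Keep real lists, swap anything else (here: NaN) for a new empty list.
cleaned = actions.apply(lambda x: x if isinstance(x, list) else [])
print(cleaned.tolist())
```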
Then create a function add_to_cart that reads the list of actions and returns the value if the type is add_to_cart, and use apply to create the column:
def add_to_cart(list_action):
    for action in list_action:
        # for each action, see if the key action_type has the value add_to_cart and return the value
        if action['action_type'] == 'add_to_cart':
            return action['value']
    # if no add_to_cart action, then empty
    return ''
df['add_to_cart'] = df['actions'].apply(add_to_cart)
The same idea for purchase:
def purchase(list_action):
    for action in list_action:
        if action['action_type'] == 'purchase':
            return action['value']
    return ''
df['purchase'] = df['actions'].apply(purchase)
Then, if you like, you can drop the column actions:
df = df.drop('actions',axis=1)
EDIT: define a single function find_action that takes the action type as a parameter, then apply it, for example:
def find_action(list_action, action_type):
    for action in list_action:
        # for each action, see if the key action_type is the one wanted
        if action['action_type'] == action_type:
            return action['value']
    # if not the right action type found, then empty
    return ''
# args must be a tuple of the extra positional arguments passed to find_action
df['add_to_cart'] = df['actions'].apply(find_action, args=('add_to_cart',))
df['purchase'] = df['actions'].apply(find_action, args=('purchase',))
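As a sketch of how find_action could be applied to every action type without listing them by hand, the example below runs the whole approach on sample data. Plain dicts stand in for the Insights objects (an assumption), and the loop that collects action_types is an addition for illustration, not part of the answer.

```python
import pandas as pd

# Plain dicts standing in for the API's Insights objects (assumption).
insights = [
    {"account_id": "1234",
     "actions": [{"action_type": "add_to_cart", "value": "8"},
                 {"action_type": "purchase", "value": "2"}],
     "cust_id": "xyz123", "cust_name": "xyz"},
    {"account_id": "1234", "cust_id": "pqr123", "cust_name": "pqr"},
    {"account_id": "1234",
     "actions": [{"action_type": "purchase", "value": "45"}],
     "cust_id": "abc123", "cust_name": "abc"},
]


def find_action(list_action, action_type):
    """Return the value of the first action of the given type, or '' if absent."""
    for action in list_action:
        if action["action_type"] == action_type:
            return action["value"]
    return ""


df = pd.DataFrame(insights)
# Replace missing cells with empty lists so every cell can be iterated.
df["actions"] = df["actions"].apply(lambda x: x if isinstance(x, list) else [])

# Collect every action type that occurs, in order of first appearance.
action_types = []
for list_action in df["actions"]:
    for action in list_action:
        if action["action_type"] not in action_types:
            action_types.append(action["action_type"])

# One new column per action type.
for action_type in action_types:
    df[action_type] = df["actions"].apply(find_action, args=(action_type,))

df = df.drop("actions", axis=1)
print(df)
```

Note that this approach fills missing values with an empty string rather than NaN, which matches the blank cells in the table the question asks for.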