Question

I have a column in a df that holds nested JSON inside a list, like this:

col1     nested-filed
1        [{nested_data}]

The data in the nested field looks like this:

[{'field': 1, 'timestamp': 1511404149332, 'changed-timestamp': 0, 'identities': [{'type': 'leadid', 'value': '123-456', 'timestamp': 1488815181110}, {'type': 'ID', 'value': '0987654321', 'timestamp': 1489691285116}, {'type': 'EMAIL', 'value': '1@1', 'timestamp': 1488815179334, 'is': True}]}]

I want to pull the email and ID out into the row, so that the new df looks like this:

col1     nested-filed          email           ID
1        [{nested_data}]       1@1.com         0987654321

How can I do this? I need to extract these values from a DataFrame with millions of rows.

Answer 1

You can try this:

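import ast

# Parse the column only if its cells are strings; skip this step if they
# are already Python lists. Bracket access is needed because the column
# name in the question ('nested-filed') contains a hyphen.
df['nested-filed'] = df['nested-filed'].apply(ast.literal_eval)

# Each cell is a one-element list whose dict holds an 'identities' list.
# Store in a new column named email (EMAIL is at index 2 in the sample)
df['email'] = df['nested-filed'].apply(lambda x: x[0]['identities'][2]['value'])

# Store in a new column named ID (ID is at index 1 in the sample)
df['ID'] = df['nested-filed'].apply(lambda x: x[0]['identities'][1]['value'])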
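
The positional indexes above only hold if every row lists its identities in the same order as the sample. If that is not guaranteed, a lookup by 'type' may be safer. The following is a minimal sketch, not a tuned solution: the helper name extract_identities and its wanted parameter are made up for illustration, and it assumes each cell is either a Python list or its string representation. It parses each cell once and builds both columns from a single pass, which may matter with millions of rows.

```python
import ast

import pandas as pd


def extract_identities(cell, wanted=("EMAIL", "ID")):
    """Return {type: value} for the wanted identity types found in one cell.

    Hypothetical helper: assumes `cell` is a list of dicts, or a string
    that ast.literal_eval can turn into one.
    """
    records = ast.literal_eval(cell) if isinstance(cell, str) else cell
    found = {}
    for record in records:
        for identity in record.get("identities", []):
            if identity.get("type") in wanted:
                # Keep the first value seen for each type.
                found.setdefault(identity["type"], identity.get("value"))
    return found


# Sample row copied from the question.
sample = [{'field': 1, 'timestamp': 1511404149332, 'changed-timestamp': 0,
           'identities': [
               {'type': 'leadid', 'value': '123-456', 'timestamp': 1488815181110},
               {'type': 'ID', 'value': '0987654321', 'timestamp': 1489691285116},
               {'type': 'EMAIL', 'value': '1@1', 'timestamp': 1488815179334, 'is': True},
           ]}]
df = pd.DataFrame({"col1": [1], "nested-filed": [sample]})

# One pass over the column; reindex guarantees both columns exist even
# when no row contains a given type (missing values become NaN).
extracted = pd.DataFrame(
    [extract_identities(cell) for cell in df["nested-filed"]],
    index=df.index,
).reindex(columns=["EMAIL", "ID"])

df["email"] = extracted["EMAIL"]
df["ID"] = extracted["ID"]
```

Rows that lack an EMAIL or ID entry end up with NaN instead of raising an IndexError. If the cells are stored as strings, parsing them is likely to be the most expensive step, so it is worth doing only once per row.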
