I have a DataFrame in the following format:
id, ref
101, [{'id': '74947', 'type': {'id': '104', 'name': 'Sales', 'inward': 'Sales', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-A'}}]
102, [{'id': '74948', 'type': {'id': '105', 'name': 'Return', 'inward': 'Return Order', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-C'}}]
103, [{'id': '74949', 'type': {'id': '106', 'name': 'Sales', 'inward': 'Return Order', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-B'}}]
I am trying to extract the rows where name = Sales and return the following output:
id, value
101, Prod-A
103, Prod-B
Answer 0 (score: 2)
Use str[0] to take the first element of each list, then Series.str.get to look up the values by dictionary key:
# If 'ref' holds strings rather than real objects, convert the list/dict repr to list/dict first
import ast
df['ref'] = df['ref'].apply(ast.literal_eval)
df['names'] = df['ref'].str[0].str.get('type').str.get('name')
df['value'] = df['ref'].str[0].str.get('inwardIssue').str.get('key')
print (df)
id ref names value
0 101 [{'id': '74947', 'type': {'id': '104', 'name':... Sales Prod-A
1 102 [{'id': '74948', 'type': {'id': '105', 'name':... Return Prod-C
2 103 [{'id': '74949', 'type': {'id': '106', 'name':... Sales Prod-B
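To see why the chain above works, here is a minimal sketch on a toy Series. The data and the names `s`, `first`, and `names` are made up for illustration and are not part of the question. The `.str` accessor also works element by element on lists and dicts, so `.str[0]` picks the first item of each list and `.str.get(key)` looks up a dictionary key.

```python
import pandas as pd

# Toy data (assumed for illustration): each cell holds a list containing one dict
s = pd.Series([
    [{'type': {'name': 'Sales'}}],
    [{'type': {'name': 'Return'}}],
])

# .str[0] takes the first element of each list, giving a Series of dicts
first = s.str[0]

# .str.get(key) looks up a key in each dict; chaining it walks the nesting
names = first.str.get('type').str.get('name')
print(names.tolist())
```

Note that this only inspects the first element of each list; if a list could contain several links, the later ones are ignored.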
Then filter with boolean indexing:
df1 = df.loc[df['names'].eq('Sales'), ['id','value']]
print (df1)
id value
0 101 Prod-A
2 103 Prod-B
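For reference, the two steps can be put together into one runnable sketch. The DataFrame is rebuilt here from the rows in the question, under the assumption that `ref` is stored as a string repr (which is why `ast.literal_eval` is applied); if the column already holds real lists, that line can be dropped. The intermediate name `first` is only for illustration.

```python
import ast
import pandas as pd

# Rows copied from the question; 'ref' is assumed to be stored as a string repr
df = pd.DataFrame({
    'id': [101, 102, 103],
    'ref': [
        "[{'id': '74947', 'type': {'id': '104', 'name': 'Sales', 'inward': 'Sales', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-A'}}]",
        "[{'id': '74948', 'type': {'id': '105', 'name': 'Return', 'inward': 'Return Order', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-C'}}]",
        "[{'id': '74949', 'type': {'id': '106', 'name': 'Sales', 'inward': 'Return Order', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-B'}}]",
    ],
})

# Parse the string repr into real Python lists of dicts
df['ref'] = df['ref'].apply(ast.literal_eval)

# First dict of each list, then the nested values by key
first = df['ref'].str[0]
df['names'] = first.str.get('type').str.get('name')
df['value'] = first.str.get('inwardIssue').str.get('key')

# Keep only the rows whose type name is 'Sales'
df1 = df.loc[df['names'].eq('Sales'), ['id', 'value']]
print(df1)
```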