Pandas - Extracting values from a DataFrame based on certain key values

Date: 2020-04-17 07:57:30

Tags: pandas

I have a DataFrame in the following format:

id, ref
101, [{'id': '74947', 'type': {'id': '104', 'name': 'Sales', 'inward': 'Sales', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-A'}}]
102, [{'id': '74948', 'type': {'id': '105', 'name': 'Return', 'inward': 'Return Order', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-C'}}]
103, [{'id': '74949', 'type': {'id': '106', 'name': 'Sales', 'inward': 'Return Order', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-B'}}]

I am trying to extract the rows where name = Sales and return the following output:

id, value
101, Prod-A 
103, Prod-B

1 Answer:

Answer 0: (score: 2)

Use str[0] to select the first element of each list, then Series.str.get to look up values by dictionary key:

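# If necessary, convert the string representation of each list/dict into real list/dict objects
import ast
df['ref'] = df['ref'].apply(ast.literal_eval)

# First list element -> 'type' dict -> 'name' value
df['names'] = df['ref'].str[0].str.get('type').str.get('name')
# First list element -> 'inwardIssue' dict -> 'key' value
df['value'] = df['ref'].str[0].str.get('inwardIssue').str.get('key')

print(df)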

    id                                                ref   names   value
0  101  [{'id': '74947', 'type': {'id': '104', 'name':...   Sales  Prod-A
1  102  [{'id': '74948', 'type': {'id': '105', 'name':...  Return  Prod-C
2  103  [{'id': '74949', 'type': {'id': '106', 'name':...   Sales  Prod-B
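
For a single row, the chained accessors above roughly correspond to ordinary list indexing and dict.get calls. The following is a minimal standard-library sketch of that idea, assuming the ref value arrives as a string (the question does not say whether it does); the variable names raw, parsed, first, name and value are illustrative only.

```python
import ast

# First row's ref value, written as a string as it might appear after reading from a file
raw = (
    "[{'id': '74947', "
    "'type': {'id': '104', 'name': 'Sales', 'inward': 'Sales', 'outward': 'PO'}, "
    "'inwardIssue': {'id': '76560', 'key': 'Prod-A'}}]"
)

parsed = ast.literal_eval(raw)  # string -> list containing one dict
first = parsed[0]               # the element that .str[0] selects

# Nested lookups; the {} default avoids an AttributeError if a key is missing
name = first.get('type', {}).get('name')
value = first.get('inwardIssue', {}).get('key')

print(name, value)
```

ast.literal_eval only evaluates Python literals, so it is a safer choice than eval for this kind of conversion.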

Then filter with boolean indexing:

df1 = df.loc[df['names'].eq('Sales'), ['id','value']]
print(df1)
    id   value
0  101  Prod-A
2  103  Prod-B
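
To try the answer end to end, here is a minimal self-contained sketch that rebuilds the sample data from the question. It assumes the ref column holds strings (hence the ast.literal_eval step) and that each list has exactly one entry, as in the sample; the names rows, names and result are illustrative and not part of the original answer.

```python
import ast

import pandas as pd

# Sample rows copied from the question; ref is stored as a string here (an assumption)
rows = [
    (101, "[{'id': '74947', 'type': {'id': '104', 'name': 'Sales', 'inward': 'Sales', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-A'}}]"),
    (102, "[{'id': '74948', 'type': {'id': '105', 'name': 'Return', 'inward': 'Return Order', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-C'}}]"),
    (103, "[{'id': '74949', 'type': {'id': '106', 'name': 'Sales', 'inward': 'Return Order', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-B'}}]"),
]
df = pd.DataFrame(rows, columns=['id', 'ref'])

# Parse each string into a list of dicts
df['ref'] = df['ref'].apply(ast.literal_eval)

# Pull the nested values out of the first list element of every row
names = df['ref'].str[0].str.get('type').str.get('name')
df['value'] = df['ref'].str[0].str.get('inwardIssue').str.get('key')

# Keep only the Sales rows and the two requested columns
result = df.loc[names.eq('Sales'), ['id', 'value']]
print(result)
```

Keeping names as a separate Series rather than a column avoids adding a helper column to df when only the filtered id and value columns are needed.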