I have a dictionary, stores, whose keys are outlet names and whose values are outlet types:
stores = {'McDonalds':'Fast food','African and Eastern Beverage':'Alcohol','Baskin Robbins': 'ice Cream'}
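I also have a pandas DataFrame whose Merchant column holds the outlet name followed by the branch location, and some rows have NaN in MerchantType.
I need to fill the MerchantType column from the stores dictionary, but only for rows that are NaN and whose Merchant matches a dictionary key. Keep in mind that the dictionary keys do not include the location (Abu Dhabi, Dubai, Washington, etc.).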
print (df)
                                     Merchant  MerchantType
index
0                         McDonalds Abu Dhabi           NaN
1                               Dunkin Donuts        Sweets
2          African and Eastern Beverage Dubai           NaN
3                                 Burger King     Fast Food
4      African and Eastern Beverage Abu Dhabi           NaN
5                                 Burger King     Fast Food
6                   Baskin Robbins Washington           NaN
What is the most efficient way to do this?
Answer 0 (score: 1)
Loop over the dictionary and, for each key, use str.contains to find the matching rows, then set the value only where MerchantType is missing:
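# Only rows whose MerchantType is missing may be overwritten
mask = df['MerchantType'].isnull()
for k, v in stores.items():
    # Literal (non-regex) substring match of the store name within Merchant
    df.loc[df['Merchant'].str.contains(k, regex=False) & mask, 'MerchantType'] = v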
print (df)
                                     Merchant  MerchantType
index
0                         McDonalds Abu Dhabi     Fast food
1                               Dunkin Donuts        Sweets
2          African and Eastern Beverage Dubai       Alcohol
3                                 Burger King     Fast Food
4      African and Eastern Beverage Abu Dhabi       Alcohol
5                                 Burger King     Fast Food
6                   Baskin Robbins Washington     ice Cream
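If the dictionary is large, a possible alternative to looping over every key is to build one regular expression from all the keys, extract the matched store name with str.extract, and map it to its type. This is only a sketch, not a benchmarked solution. The DataFrame is rebuilt here from the sample in the question, and the names keys, pattern and matched are made up for illustration. Sorting the keys longest first is an assumption on my part, so that a longer store name is preferred over a shorter one it contains.

```python
import re

import numpy as np
import pandas as pd

stores = {'McDonalds': 'Fast food',
          'African and Eastern Beverage': 'Alcohol',
          'Baskin Robbins': 'ice Cream'}

# Sample data rebuilt from the question
df = pd.DataFrame({
    'Merchant': ['McDonalds Abu Dhabi', 'Dunkin Donuts',
                 'African and Eastern Beverage Dubai', 'Burger King',
                 'African and Eastern Beverage Abu Dhabi', 'Burger King',
                 'Baskin Robbins Washington'],
    'MerchantType': [np.nan, 'Sweets', np.nan, 'Fast Food',
                     np.nan, 'Fast Food', np.nan],
})

# Longest keys first (assumption: a longer name should win over a shorter
# one it contains); re.escape keeps the keys literal inside the regex
keys = sorted(stores, key=len, reverse=True)
pattern = '(' + '|'.join(re.escape(k) for k in keys) + ')'

# Extract the first matching store name per row (NaN when nothing matches),
# translate it to its type, and fill only the missing MerchantType values
matched = df['Merchant'].str.extract(pattern, expand=False)
df['MerchantType'] = df['MerchantType'].fillna(matched.map(stores))

print(df)
```

Rows that already have a MerchantType are left alone because fillna only touches missing values, and rows with no matching key stay NaN.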