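I managed to find an answer to a problem I had earlier (found here: How can i create a ruleset to assign values to specific columns, based on searching substrings, in Pandas?).
However, I would like to know whether there is a more efficient way to do this. I want to create several categorical columns based on strings I search for in a description column.
Currently my strategy is as follows: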
android_phones = ['samsung', 'xperia', 'google']
iphone = ['iphone', 'apple']

def OS_rules(raw_Df):
    val = ''
    if any(word in raw_Df['Names'].lower() for word in android_phones):
        val = 'android'
    elif any(word in raw_Df['Names'].lower() for word in iphone):
        val = 'iPhone'
    else:
        val = 'Handset'
    return val

df.loc[:, 'OS_Type'] = df.apply(OS_rules, axis=1)
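However, with this strategy I need to create several functions that use almost the same rules but return different values.
Is there a way to return multiple values from a single function and apply them to multiple new columns?
e.g.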
if any(word in raw_Df['Names'].lower() for word in android_phones):
    val1 = 'android'
    val2 = 'pixel'
    val3 = 'vodafone'
and so on, and then create new columns from those?
Answer 0 (score: 0)
Use:
# create a dictionary of all keyword lists
d = {'android': android_phones, 'iPhone': iphone}

def OS_rules(raw_Df):
    # loop over the dictionary and return the key of the first match
    for k, v in d.items():
        if any(word in raw_Df['Names'].lower() for word in v):
            return k

# if no value matches, the result is NaN, so fillna with the default value
df['OS_Type'] = df.apply(OS_rules, axis=1).fillna('Handset')
print(df)
                    Names  qty  OS_Type
0     IPHONE_3UK_CONTRACT  968   iPhone
1       IPHONE_O2_SIMONLY  155   iPhone
2        ANDROID_3UK_PAYG   77  Handset
3  ANDROID_VODAF_CONTRACT  973  Handset
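
The question also asks about filling several new columns from one function. The following is only a minimal sketch of one way that might be done, not part of the original answer: the function returns a pd.Series, and Series.apply then expands it into one column per label. The 'Maker' column, its values, and the names rules, default and classify are hypothetical and chosen purely for illustration; the sample data is copied from the output above.

```python
import pandas as pd

# Sample data copied from the printed output above.
df = pd.DataFrame({
    'Names': ['IPHONE_3UK_CONTRACT', 'IPHONE_O2_SIMONLY',
              'ANDROID_3UK_PAYG', 'ANDROID_VODAF_CONTRACT'],
    'qty': [968, 155, 77, 973],
})

# Hypothetical rule table: each entry pairs a keyword list with the values
# to write into every new column. 'Maker' is an assumed, illustrative column.
rules = [
    (['samsung', 'xperia', 'google'], {'OS_Type': 'android', 'Maker': 'other'}),
    (['iphone', 'apple'], {'OS_Type': 'iPhone', 'Maker': 'apple'}),
]

# Values used when no keyword matches.
default = {'OS_Type': 'Handset', 'Maker': 'unknown'}

def classify(name):
    """Return one value per new column for a single name."""
    lowered = name.lower()
    for keywords, values in rules:
        if any(word in lowered for word in keywords):
            return pd.Series(values)
    return pd.Series(default)

# When the applied function returns a Series, Series.apply yields a
# DataFrame with one column per label, which can be joined back on the index.
new_cols = df['Names'].apply(classify)
df = df.join(new_cols)
print(df)
```

As in the output above, the two ANDROID_ rows fall through to the default values, because 'android' itself is not one of the keywords in the lists. Adding another column should only require another key in each dictionary, so a single function covers all of the new columns.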