How to apply if, else, and else-if conditions in a pandas DataFrame

Time: 2019-08-23 04:10:30

Tags: python-3.x pandas numpy dataframe if-statement

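I have a column of country names in my pandas DataFrame. I want to apply different filters to that column using if-else conditions, and add a new column to the DataFrame based on those conditions.

Current DataFrame: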

Company Country
BV 	Denmark
BV 	Sweden
DC 	Norway
BV 	Germany
BV 	France
DC 	Croatia
BV 	Italy
DC 	Germany
BV 	Austria
BV 	Spain

I have tried this, but with this approach I have to define the countries again and again.

bookings_d2.loc[(bookings_d2.Country == 'Denmark') | (bookings_d2.Country == 'Norway'), 'Country'] = bookings_d2.Country

In R I currently use ifelse conditions like the ones below, and I want to implement the same thing in Python.

R code example 1: ifelse(bookings_d2$COUNTRY_NAME %in% c('Denmark', 'Germany', 'Norway', 'Sweden', 'France', 'Italy', 'Spain', 'Germany', 'Austria', 'Netherlands', 'Croatia', 'Belgium'), as.character(bookings_d2$COUNTRY_NAME), "Others")

R code example 2: ifelse(bookings_d2$country %in% c('Germany'), ifelse(bookings_d2$BOOKING_BRAND %in% c('BV'), 'Germany_BV', 'Germany_DC'), bookings_d2$country)

Expected DataFrame:

Company Country
BV 	Denmark
BV 	Sweden
DC 	Norway
BV 	Germany_BV
BV 	France
DC 	Croatia
BV 	Italy
DC 	Germany_DC
BV 	Others
BV 	Others

3 answers:

Answer 0 (score: 2)

I'm not sure exactly what you are trying to achieve, but I guess it is something along these lines:

df=pd.DataFrame({'country':['Sweden','Spain','China','Japan'], 'continent':[None] * 4})

  country continent
0  Sweden      None
1   Spain      None
2   China      None
3   Japan      None


df.loc[(df.country=='Sweden') | ( df.country=='Spain'), 'continent'] = "Europe"
df.loc[(df.country=='China') | ( df.country=='Japan'), 'continent'] = "Asia"

  country continent
0  Sweden    Europe
1   Spain    Europe
2   China      Asia
3   Japan      Asia
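
The chained == comparisons above can also be replaced with a dictionary lookup, so each country is written only once. A minimal sketch, assuming a toy frame like the one above with one extra unmapped row; the dict name continent_of is hypothetical and not part of the answer:

```python
import pandas as pd

df = pd.DataFrame({'country': ['Sweden', 'Spain', 'China', 'Japan', 'Peru']})

# Hypothetical lookup table: one entry per country, no repeated comparisons.
continent_of = {'Sweden': 'Europe', 'Spain': 'Europe',
                'China': 'Asia', 'Japan': 'Asia'}

# Series.map returns NaN for countries missing from the dict;
# fillna then supplies the "else" branch.
df['continent'] = df['country'].map(continent_of).fillna('Other')
print(df)
```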

You can also use a Python list comprehension like this:

df.continent=["Europe" if (x=="Sweden" or x=="Denmark") else "Other" for x in df.country]
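
The same list-comprehension idea can cover the question's full rule set by zipping two columns and delegating to a small helper. A minimal sketch, assuming the question's sample data; the helper name label_country and the set KEEP are hypothetical, and Austria and Spain are left out of KEEP so that they match the expected output:

```python
import pandas as pd

df = pd.DataFrame({
    'Company': ['BV', 'BV', 'DC', 'BV', 'BV', 'DC', 'BV', 'DC', 'BV', 'BV'],
    'Country': ['Denmark', 'Sweden', 'Norway', 'Germany', 'France',
                'Croatia', 'Italy', 'Germany', 'Austria', 'Spain'],
})

# Hypothetical whitelist: countries that keep their own name.
KEEP = {'Denmark', 'Germany', 'Norway', 'Sweden', 'France', 'Italy',
        'Netherlands', 'Croatia', 'Belgium'}

def label_country(company, country):
    """Plain if / elif / else applied to one row's values."""
    if country == 'Germany':
        return 'Germany_BV' if company == 'BV' else 'Germany_DC'
    elif country in KEEP:
        return country
    else:
        return 'Others'

df['Country'] = [label_country(c, n)
                 for c, n in zip(df['Company'], df['Country'])]
print(df)
```

A Python-level loop like this is easy to read, but it is generally slower than vectorized masks on large frames.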

Answer 1 (score: 1)

You can do it like this:

country_others=['Poland','Switzerland']


# Append the company to the 'Germany' rows, e.g. 'Germany_BV'
mask = df['Country'] == 'Germany'
df.loc[mask, 'Country'] = df.loc[mask, 'Country'] + '_' + df.loc[mask, 'Company']
df.loc[(df['Company']=='DC') &(df['Country'].isin(country_others)),'Country']='Others'

Answer 2 (score: 1)

You can use:

Example 1: use Series.isin with numpy.where, or with loc, in which case the mask must be inverted with ~:

#removed Austria, Spain
L = ['Denmark','Germany','Norway','Sweden','France','Italy',
     'Germany','Netherlands','Croatia','Belgium']

df['Country'] = np.where(df['Country'].isin(L), df['Country'], 'Others')

Alternative:

df.loc[~df['Country'].isin(L), 'Country'] ='Others'
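
To see that the two variants of example 1 agree, here is a self-contained sketch that runs both on the question's country column. The frame names df_where and df_loc are illustrative only, and the duplicate 'Germany' entry in L is dropped since it has no effect on isin:

```python
import numpy as np
import pandas as pd

countries = ['Denmark', 'Sweden', 'Norway', 'Germany', 'France',
             'Croatia', 'Italy', 'Germany', 'Austria', 'Spain']
L = ['Denmark', 'Germany', 'Norway', 'Sweden', 'France', 'Italy',
     'Netherlands', 'Croatia', 'Belgium']

# Variant 1: np.where keeps the name where the mask is True, else 'Others'.
df_where = pd.DataFrame({'Country': countries})
df_where['Country'] = np.where(df_where['Country'].isin(L),
                               df_where['Country'], 'Others')

# Variant 2: .loc overwrites only the rows where the inverted mask is True.
df_loc = pd.DataFrame({'Country': countries})
df_loc.loc[~df_loc['Country'].isin(L), 'Country'] = 'Others'

# Both variants should produce the same column.
print(df_where['Country'].tolist() == df_loc['Country'].tolist())
```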

Example 2: use numpy.select or nested np.where:

m1 = df['Country'] == 'Germany'
m2 = df['Company'] == 'BV'
df['Country'] = np.select([m1 & m2, m1 & ~m2],['Germany_BV','Germany_DC'], df['Country'])

Alternative:

df['Country'] = np.where(~m1, df['Country'],
                np.where(m2, 'Germany_BV','Germany_DC'))
print (df)
  Company     Country
0      BV     Denmark
1      BV      Sweden
2      DC      Norway
3      BV  Germany_BV
4      BV      France
5      DC     Croatia
6      BV       Italy
7      DC  Germany_DC
8      BV      Others
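
Because numpy.select returns the choice for the first condition that is True, the two examples can also be folded into one call that reads like if / elif / else. A sketch on the question's sample data; the mask names are illustrative and not taken from the answer:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Company': ['BV', 'BV', 'DC', 'BV', 'BV', 'DC', 'BV', 'DC', 'BV', 'BV'],
    'Country': ['Denmark', 'Sweden', 'Norway', 'Germany', 'France',
                'Croatia', 'Italy', 'Germany', 'Austria', 'Spain'],
})
L = ['Denmark', 'Germany', 'Norway', 'Sweden', 'France', 'Italy',
     'Netherlands', 'Croatia', 'Belgium']

is_germany = df['Country'] == 'Germany'
is_bv = df['Company'] == 'BV'
not_listed = ~df['Country'].isin(L)

# Conditions are checked in order; the first True one picks the value,
# and rows matching none fall through to the default (the original name).
df['Country'] = np.select(
    [is_germany & is_bv, is_germany & ~is_bv, not_listed],
    ['Germany_BV', 'Germany_DC', 'Others'],
    default=df['Country'],
)
print(df)
```

With the two-step version the order matters: running example 2 first would rename Germany to Germany_BV or Germany_DC, which are not in L, so example 1 would then turn those rows into Others.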
9      BV      Others