Question

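I have a column of country names in my pandas DataFrame. I want to apply different filters to that column using if-else conditions, and add a new column to the DataFrame based on those conditions.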

Current DataFrame:

Company Country
BV 	Denmark
BV 	Sweden
DC 	Norway
BV 	Germany
BV 	France
DC 	Croatia
BV 	Italy
DC 	Germany
BV 	Austria
BV 	Spain

I have tried the following, but with this approach I have to spell out the countries again and again.

bookings_d2.loc[(bookings_d2.Country == 'Denmark') | (bookings_d2.Country == 'Norway'), 'Country'] = bookings_d2.Country

In R I currently use ifelse conditions like the ones below, and I want to implement the same thing in Python.

R code example 1: ifelse(bookings_d2$COUNTRY_NAME %in% c('Denmark', 'Germany', 'Norway', 'Sweden', 'France', 'Italy', 'Spain', 'Germany', 'Austria', 'Netherlands', 'Croatia', 'Belgium'), as.character(bookings_d2$COUNTRY_NAME), "Others")

R code example 2: ifelse(bookings_d2$country %in% c('Germany'), ifelse(bookings_d2$BOOKING_BRAND %in% c('BV'), 'Germany_BV', 'Germany_DC'), bookings_d2$country)

Expected DataFrame:

Company Country
BV 	Denmark
BV 	Sweden
DC 	Norway
BV 	Germany_BV
BV 	France
DC 	Croatia
BV 	Italy
DC 	Germany_DC
BV 	Others
BV 	Others

Answer 1

I'm not sure exactly what you are trying to achieve, but I guess it is something along these lines:

df=pd.DataFrame({'country':['Sweden','Spain','China','Japan'], 'continent':[None] * 4})

  country continent
0  Sweden      None
1   Spain      None
2   China      None
3   Japan      None


df.loc[(df.country=='Sweden') | ( df.country=='Spain'), 'continent'] = "Europe"
df.loc[(df.country=='China') | ( df.country=='Japan'), 'continent'] = "Asia"

  country continent
0  Sweden    Europe
1   Spain    Europe
2   China      Asia
3   Japan      Asia

You can also use a Python list comprehension like this:

df.continent=["Europe" if (x=="Sweden" or x=="Denmark") else "Other" for x in df.country]
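When there are more than two groups, chaining conditions gets repetitive. One possible alternative, not part of the answer above, is a lookup with `Series.map` plus `fillna` for the else-branch. This is a minimal sketch; the dictionary `continent_by_country` and the extra 'Brazil' row are hypothetical and only there for illustration.

```python
import pandas as pd

df = pd.DataFrame({'country': ['Sweden', 'Spain', 'China', 'Japan', 'Brazil']})

# Hypothetical lookup table; extend it with whatever groups you need.
continent_by_country = {
    'Sweden': 'Europe',
    'Spain': 'Europe',
    'China': 'Asia',
    'Japan': 'Asia',
}

# map() yields NaN for countries missing from the dict;
# fillna() then supplies the "else" value.
df['continent'] = df['country'].map(continent_by_country).fillna('Other')
print(df)
```

The dictionary keeps each country in one place, so adding a group means adding an entry rather than another condition.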

Answer 2

You can get it with:

country_others=['Poland','Switzerland']


# Append "_<Company>" to the Germany rows only; the right-hand Series is aligned on the index
df.loc[df['Country']=='Germany','Country']='Germany_'+df['Company']
df.loc[(df['Company']=='DC') &(df['Country'].isin(country_others)),'Country']='Others'

Answer 3

You can use:

For example 1, use Series.isin with numpy.where or loc; with loc the mask has to be inverted with ~:

#removed Austria, Spain
L = ['Denmark','Germany','Norway','Sweden','France','Italy',
     'Germany','Netherlands','Croatia','Belgium']

df['Country'] = np.where(df['Country'].isin(L), df['Country'], 'Others')

Alternative:

df.loc[~df['Country'].isin(L), 'Country'] ='Others'
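The same "keep if in the list, else replace" logic can also be written with `Series.where`, which stays inside pandas. The sketch below is not from the answer; it rebuilds the question's data so it can run on its own, and the name `keep` is just an illustrative stand-in for the list `L`.

```python
import pandas as pd

df = pd.DataFrame({
    'Company': ['BV', 'BV', 'DC', 'BV', 'BV', 'DC', 'BV', 'DC', 'BV', 'BV'],
    'Country': ['Denmark', 'Sweden', 'Norway', 'Germany', 'France',
                'Croatia', 'Italy', 'Germany', 'Austria', 'Spain'],
})

# Countries to keep as-is (Austria and Spain left out, as in the answer).
keep = ['Denmark', 'Germany', 'Norway', 'Sweden', 'France', 'Italy',
        'Netherlands', 'Croatia', 'Belgium']

# where() keeps values where the condition is True and replaces the rest.
df['Country'] = df['Country'].where(df['Country'].isin(keep), 'Others')
print(df)
```

Note that `where` needs no `~`: the condition marks the rows to keep, which mirrors the argument order of R's `ifelse` more closely than the `loc` form.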

For example 2, use numpy.select or nested np.where:

m1 = df['Country'] == 'Germany'
m2 = df['Company'] == 'BV'
df['Country'] = np.select([m1 & m2, m1 & ~m2],['Germany_BV','Germany_DC'], df['Country'])

Alternative:

df['Country'] = np.where(~m1, df['Country'],
                np.where(m2, 'Germany_BV','Germany_DC'))
print (df)
  Company     Country
0      BV     Denmark
1      BV      Sweden
2      DC      Norway
3      BV  Germany_BV
4      BV      France
5      DC     Croatia
6      BV       Italy
7      DC  Germany_DC
8      BV      Others
9      BV      Others
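Both steps can also be folded into a single `np.select` call, since it picks the first matching condition for each row. The following is only a sketch under that assumption; the function name `recode_country` and its `keep` parameter are hypothetical and do not appear in the answer.

```python
import numpy as np
import pandas as pd

def recode_country(df, keep):
    """Return a copy of df with Country recoded in one pass (illustrative helper)."""
    in_list = df['Country'].isin(keep)
    is_germany = df['Country'] == 'Germany'
    is_bv = df['Company'] == 'BV'

    # np.select uses the first condition that is True for each row,
    # so the order below acts like if / elif / elif / else.
    conditions = [~in_list, is_germany & is_bv, is_germany]
    choices = ['Others', 'Germany_BV', 'Germany_DC']

    out = df.copy()
    out['Country'] = np.select(conditions, choices, default=df['Country'])
    return out

df = pd.DataFrame({
    'Company': ['BV', 'BV', 'DC', 'BV', 'BV', 'DC', 'BV', 'DC', 'BV', 'BV'],
    'Country': ['Denmark', 'Sweden', 'Norway', 'Germany', 'France',
                'Croatia', 'Italy', 'Germany', 'Austria', 'Spain'],
})
keep = ['Denmark', 'Germany', 'Norway', 'Sweden', 'France', 'Italy',
        'Netherlands', 'Croatia', 'Belgium']

result = recode_country(df, keep)
print(result)
```

Returning a copy leaves the input DataFrame unchanged, which makes it easier to compare the original and recoded columns side by side.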

How to apply if, else, else-if conditions in a pandas DataFrame
