Condition on 2 DataFrame columns to produce a 3rd column

Date: 2019-10-27 10:58:47

Tags: python pandas dataframe if-statement

I want to create an IF condition that sets the values of a new column ('new_col'). The general idea is:

if 'Score' == np.nan and 'year' == 2012: return 1

elif 'Score' == np.nan and 'year' == 2013: return 2

else: return 'Score'

import numpy as np
import pandas as pd
data = {'year': [2010, 2011, 2012, 2013, 2014], 'Score': [10, 15, np.nan, np.nan, 3]}
df = pd.DataFrame(data, columns=['year', 'Score'])



  year  Score
0  2010   10.0
1  2011   15.0
2  2012    1.0
3  2013    2.0
4  2014    3.0

2 Answers:

Answer 0 (score: 0)

First, test for missing values with Series.isna. You can then compare the year with Series.eq (equivalent to ==) and set the values with numpy.select:

m1 = df['Score'].isna() & df['year'].eq(2012)
m2 = df['Score'].isna() & df['year'].eq(2013)

df['Score'] = np.select([m1, m2], [1, 2], default=df['Score'])
print(df)
   year  Score
0  2010   10.0
1  2011   15.0
2  2012    1.0
3  2013    2.0
4  2014    3.0

To write the result to a new column instead, use:

df['new_col'] = np.select([m1, m2], [1, 2], default=df['Score'])
print(df)
   year  Score  new_col
0  2010   10.0     10.0
1  2011   15.0     15.0
2  2012    NaN      1.0
3  2013    NaN      2.0
4  2014    3.0      3.0
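
The reason Series.isna is needed rather than a comparison with np.nan can be checked with a short sketch. It rebuilds the question's DataFrame; the names eq_mask, na_mask, n_eq and n_na are only illustrative and do not come from the answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'year': [2010, 2011, 2012, 2013, 2014],
                   'Score': [10, 15, np.nan, np.nan, 3]})

# NaN never compares equal to anything, including itself,
# so an equality test against np.nan matches no rows.
eq_mask = df['Score'] == np.nan

# isna() is the reliable test for missing values.
na_mask = df['Score'].isna()

n_eq = int(eq_mask.sum())
n_na = int(na_mask.sum())
print(n_eq, n_na)  # 0 2
```

This is why the condition written as 'Score' == np.nan in the question would never fire, while the isna-based masks m1 and m2 select the 2012 and 2013 rows.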

Answer 1 (score: 0)

Use np.select together with Series.isnull():

condition_1 = (df['Score'].isnull()) & (df['year'] == 2012)
condition_2 = (df['Score'].isnull()) & (df['year'] == 2013)
values = [1, 2]

df['new_col'] = np.select([condition_1, condition_2], values, df['Score'])

The syntax of np.select is: numpy.select(condlist, choicelist, default)

df

    year    Score   new_col
0   2010    10.0    10.0
1   2011    15.0    15.0
2   2012    NaN     1.0
3   2013    NaN     2.0
4   2014    3.0     3.0
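
As a small illustration of how np.select resolves its arguments, the sketch below uses a made-up array x instead of the question's data. It shows two points: where several conditions are true, the first one in the list wins, and where none is true, the default is used.

```python
import numpy as np

# Hypothetical data, chosen so that the two conditions overlap.
x = np.array([1, 5, 10, 20])

conds = [x > 3, x > 8]   # the second condition is a subset of the first
choices = [100, 200]

# Elements matching no condition fall back to the default (here: x itself).
result = np.select(conds, choices, default=x)
print(result)  # [  1 100 100 100]
```

Because 10 and 20 already satisfy the first condition, they never reach the second choice. In the answers above the two masks cannot overlap (a row cannot be both 2012 and 2013), so their order does not matter there.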