Create a new column based on conditions

Date: 2019-12-04 19:17:59

Tags: python pandas conditional-statements

I have a dataframe in the following format:

 weekday           station   num_bikes  num_racks  hour
 no               Girwood   5           6         8
 yes              Girwood   6           5         12
 yes              Girwood   2           9         6
 no               Girwood   9           2         18
 yes              Fraser    0           14        16

I am trying to create a new column for rush hour based on the values of the hour and weekday columns. The code I am using is:

df.loc[(df['hour'] <7) , 'Rush_hour?'] = 'No'
df.loc[(df['hour']>=7) & (df['hour']<=10) & (df['weekday?'] == 'yes'), 'Rush hour?'] = 'Yes-am' 
df.loc[(df['hour']>=7) & (df['hour']<=10) & (df['weekday?'] == 'no'), 'Rush hour?'] = 'No' 
df.loc[(df['hour']>10) & (df['hour'] <15) , 'Rush_hour?'] = 'No' 
df.loc[(df['hour'] >=15) & (df['hour']<=18) & (df['weekday?'] == 'yes'), 'Rush hour?'] = ' Yes-pm' 
df.loc[(df['hour'] >=15) & (df['hour']<=18) & (df['weekday?'] == 'no'), 'Rush hour?'] = ' No' 
df.loc[(df['hour']>18) , 'Rush_hour?'] = 'No' 

When I run this code I get NaN values. Can anyone suggest what is wrong with my code?

2 answers:

Answer 0: (score: 1)

Your column naming is inconsistent.

'Rush hour?' vs. 'Rush_hour?' and 'weekday' vs. 'weekday?'
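A minimal sketch of why the NaN values appear, using a small frame built from the question's sample values. Assigning through `df.loc[mask, name]` to a column name that does not exist yet creates that column and fills only the rows where the mask is true, so two spellings produce two partially filled columns. The frame here assumes the column is already named 'weekday?'.

```python
import pandas as pd

# Small illustrative frame (values taken from the question's sample).
df = pd.DataFrame({
    "weekday?": ["no", "yes", "yes", "no", "yes"],
    "hour": [8, 12, 6, 18, 16],
})

# Two different spellings create two separate columns,
# each filled only where its own mask is True.
df.loc[df["hour"] < 7, "Rush_hour?"] = "No"
df.loc[
    (df["hour"] >= 15) & (df["hour"] <= 18) & (df["weekday?"] == "yes"),
    "Rush hour?",
] = "Yes-pm"

print(df.columns.tolist())  # both spellings show up as columns
print(df.isna().sum())      # count of NaN per column
```

Checking `df.columns` after the assignments is a quick way to spot this kind of mismatch.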

Try this:

df=df.rename(columns={'weekday':'weekday?'})

df.loc[(df['hour'] <7) , 'Rush hour?'] = 'No'
df.loc[(df['hour']>=7) & (df['hour']<=10) & (df['weekday?'] == 'yes'), 'Rush hour?'] = 'Yes-am' 
df.loc[(df['hour']>=7) & (df['hour']<=10) & (df['weekday?'] == 'no'), 'Rush hour?'] = 'No' 
df.loc[(df['hour']>10) & (df['hour'] <15) , 'Rush hour?'] = 'No' 
df.loc[(df['hour'] >=15) & (df['hour']<=18) & (df['weekday?'] == 'yes'), 'Rush hour?'] = 'Yes-pm'
df.loc[(df['hour'] >=15) & (df['hour']<=18) & (df['weekday?'] == 'no'), 'Rush hour?'] = 'No'
df.loc[(df['hour']>18) , 'Rush hour?'] = 'No' 

df

Output:

  weekday?  station  num_bikes  num_racks  hour Rush hour?
0       no  Girwood          5          6     8         No
1      yes  Girwood          6          5    12         No
2      yes  Girwood          2          9     6         No
3       no  Girwood          9          2    18         No
4      yes   Fraser          0         14    16     Yes-pm

This assumes your logic is correct.
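As an alternative that is not part of this answer, the same rules can be written in one assignment with `numpy.select`, which avoids repeating the column name and so leaves less room for a spelling mismatch. This is only a sketch: the frame is rebuilt from the question's sample, and the variable names `is_weekday`, `am_rush` and `pm_rush` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question, with the column already named 'weekday?'.
df = pd.DataFrame({
    "weekday?": ["no", "yes", "yes", "no", "yes"],
    "station": ["Girwood", "Girwood", "Girwood", "Girwood", "Fraser"],
    "num_bikes": [5, 6, 2, 9, 0],
    "num_racks": [6, 5, 9, 2, 14],
    "hour": [8, 12, 6, 18, 16],
})

is_weekday = df["weekday?"] == "yes"
# Series.between is inclusive on both ends by default.
am_rush = is_weekday & df["hour"].between(7, 10)
pm_rush = is_weekday & df["hour"].between(15, 18)

# Rows matching neither condition fall through to the default label.
df["Rush hour?"] = np.select([am_rush, pm_rush], ["Yes-am", "Yes-pm"], default="No")
print(df)
```

Because every row receives either a matched label or the default, no NaN can be left behind.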

Answer 1: (score: 1)

Do the following:

# Initialize a list to collect one label per row:
aList = []

# Loop over all rows and check whatever you want with if-elif-else:
for i in range(len(df)):
    h = df['hour'][i]
    w = df['weekday?'][i]

    if h < 7:
        aList.append('No')
    elif (h >= 7) and (h <= 10) and (w == 'yes'):
        aList.append('Yes-am')
    else:
        aList.append('blah blah')
    # ....
# Create a new column and assign the list to it:
df['Rush hour'] = aList
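One possible way to fill in the remaining branches is to put the row-level logic in a function and apply it with `DataFrame.apply(axis=1)`. The sketch below assumes the rules from the question (weekday hours 7 to 10 are 'Yes-am', weekday hours 15 to 18 are 'Yes-pm', everything else is 'No'); `classify_rush_hour` is a hypothetical helper name, not something from the answer.

```python
import pandas as pd


def classify_rush_hour(row):
    """Hypothetical helper: label one row using the question's rules."""
    if row["weekday?"] != "yes":
        return "No"
    if 7 <= row["hour"] <= 10:
        return "Yes-am"
    if 15 <= row["hour"] <= 18:
        return "Yes-pm"
    return "No"


# Sample values from the question, with the column named 'weekday?'.
df = pd.DataFrame({
    "weekday?": ["no", "yes", "yes", "no", "yes"],
    "hour": [8, 12, 6, 18, 16],
})

# axis=1 passes each row to the function as a Series.
df["Rush hour?"] = df.apply(classify_rush_hour, axis=1)
print(df)
```

Row-wise `apply` is easy to read but tends to be slower than boolean masks on large frames.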