I have a DataFrame in the following format:
weekday station num_bikes num_racks hour
no Girwood 5 6 8
yes Girwood 6 5 12
yes Girwood 2 9 6
no Girwood 9 2 18
yes Fraser 0 14 16
I am trying to create a new rush-hour column based on the values of the hour and weekday columns. The code I am using is:
df.loc[(df['hour'] <7) , 'Rush_hour?'] = 'No'
df.loc[(df['hour']>=7) & (df['hour']<=10) & (df['weekday?'] == 'yes'), 'Rush hour?'] = 'Yes-am'
df.loc[(df['hour']>=7) & (df['hour']<=10) & (df['weekday?'] == 'no'), 'Rush hour?'] = 'No'
df.loc[(df['hour']>10) & (df['hour'] <15) , 'Rush_hour?'] = 'No'
df.loc[(df['hour'] >=15) & (df['hour']<=18) & (df['weekday?'] == 'yes'), 'Rush hour?'] = ' Yes-pm'
df.loc[(df['hour'] >=15) & (df['hour']<=18) & (df['weekday?'] == 'no'), 'Rush hour?'] = ' No'
df.loc[(df['hour']>18) , 'Rush_hour?'] = 'No'
When I run this code I get NaN values. Can anyone suggest what is wrong with my code?
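Answer 0: (score: 1)
Your column names are inconsistent:
'Rush hour?' vs. 'Rush_hour?', and 'weekday' vs. 'weekday?'. Each `df.loc` assignment to a column name that does not exist yet creates a new column, so the results end up split across two columns and the rows a given statement does not touch are left as NaN.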
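To see why mismatched names produce NaN, here is a minimal sketch using only the `hour` values from the question (the label 'Maybe' is just a placeholder for illustration). Assigning through `df.loc` with two different spellings creates two separate columns, each filled only where its own mask was true:

```python
import pandas as pd

# Only the 'hour' column from the question is needed for this demonstration.
df = pd.DataFrame({"hour": [8, 12, 6, 18, 16]})

# Two spellings of the target column: underscore vs. space.
df.loc[df["hour"] < 7, "Rush_hour?"] = "No"
df.loc[df["hour"] >= 7, "Rush hour?"] = "Maybe"  # placeholder label

# Two new columns now exist, and each has NaN where its mask was False.
print(df.columns.tolist())
print(df.isna().sum())
```

Checking `df.columns.tolist()` after a run like this is a quick way to spot an accidental extra column.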
Try this:
df=df.rename(columns={'weekday':'weekday?'})
df.loc[(df['hour'] <7) , 'Rush hour?'] = 'No'
df.loc[(df['hour']>=7) & (df['hour']<=10) & (df['weekday?'] == 'yes'), 'Rush hour?'] = 'Yes-am'
df.loc[(df['hour']>=7) & (df['hour']<=10) & (df['weekday?'] == 'no'), 'Rush hour?'] = 'No'
df.loc[(df['hour']>10) & (df['hour'] <15) , 'Rush hour?'] = 'No'
df.loc[(df['hour'] >=15) & (df['hour']<=18) & (df['weekday?'] == 'yes'), 'Rush hour?'] = 'Yes-pm'
df.loc[(df['hour'] >=15) & (df['hour']<=18) & (df['weekday?'] == 'no'), 'Rush hour?'] = 'No'
df.loc[(df['hour']>18) , 'Rush hour?'] = 'No'
df
Output:
weekday? station num_bikes num_racks hour Rush hour?
0 no Girwood 5 6 8 No
1 yes Girwood 6 5 12 No
2 yes Girwood 2 9 6 No
3 no Girwood 9 2 18 No
4 yes Fraser 0 14 16 Yes-pm
This assumes your logic is correct.
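As a more compact alternative to the chain of `df.loc` assignments, the same rules can be expressed with `numpy.select`. This is only a sketch: it assumes the column is named `weekday?`, that rush hours are 7 to 10 and 15 to 18 inclusive on weekdays, and that everything else is 'No'. The names `is_weekday`, `am_peak`, and `pm_peak` are illustrative, not from the original post.

```python
import numpy as np
import pandas as pd

# Sample data from the question, with the column already named 'weekday?'.
df = pd.DataFrame({
    "weekday?": ["no", "yes", "yes", "no", "yes"],
    "station": ["Girwood", "Girwood", "Girwood", "Girwood", "Fraser"],
    "num_bikes": [5, 6, 2, 9, 0],
    "num_racks": [6, 5, 9, 2, 14],
    "hour": [8, 12, 6, 18, 16],
})

is_weekday = df["weekday?"] == "yes"
am_peak = df["hour"].between(7, 10)    # inclusive on both ends by default
pm_peak = df["hour"].between(15, 18)

# Conditions are checked in order; rows matching none get the default.
df["Rush hour?"] = np.select(
    [is_weekday & am_peak, is_weekday & pm_peak],
    ["Yes-am", "Yes-pm"],
    default="No",
)
print(df)
```

Because the column name appears only once and every row receives a value from either a condition or the default, this form cannot leave NaN behind.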
Answer 1: (score: 1)
Do the following:
# initialize a list to collect one label per row:
labels = []
# loop over all rows and check whatever you want with if-elif-else:
for i in range(len(df)):
    h = df['hour'].iloc[i]
    w = df['weekday?'].iloc[i]
    if h < 7:
        labels.append('No')
    elif 7 <= h <= 10 and w == 'yes':
        labels.append('Yes-am')
    else:
        # placeholder: add further elif branches for the remaining cases
        labels.append('blah blah')
    # ....
# create a new column and assign the list to it:
df['Rush hour?'] = labels
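The row-by-row idea above can also be written as a small helper function, which keeps the rules in one place and is easy to test. This is a sketch under the same assumptions as the question's logic (rush hours 7 to 10 and 15 to 18, weekdays only); `classify_rush_hour` is a hypothetical name, not part of pandas.

```python
import pandas as pd

def classify_rush_hour(hour, weekday):
    """Return the rush-hour label for a single row (hypothetical helper)."""
    if weekday != "yes":
        return "No"
    if 7 <= hour <= 10:
        return "Yes-am"
    if 15 <= hour <= 18:
        return "Yes-pm"
    return "No"

# Sample data from the question, reduced to the two columns the rule needs.
df = pd.DataFrame({
    "weekday?": ["no", "yes", "yes", "no", "yes"],
    "hour": [8, 12, 6, 18, 16],
})

# Build the whole column in one pass so the column name is written only once.
df["Rush hour?"] = [
    classify_rush_hour(h, w) for h, w in zip(df["hour"], df["weekday?"])
]
print(df)
```

For large frames a vectorized approach is usually faster, but a plain function like this is often the most readable option when the rules get more involved.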