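I tried two different approaches to this problem: 1. I used conditions inside the DataFrame assignment. 2. I tried using a function.
In both cases I get SyntaxError: invalid syntax.
I am still a beginner with Python.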
First approach:
df['hours_week'] = ['less_than_40' if x < 40 'between_40_and_45' elif x > 40 and x <= 45 'between_40_and_60' elif x >45 and x <= 60 'between_60_and_80' elif x >60 and x <=80 else 'more_than_80' for x in df['hours_per_week']]
Second approach:
def set_value(x):
    for x in df['hours_per_week']:
        if x < 40:
            t == print " less_than_40"
        elif (x > 40 and x <= 45):
            t == print "between_40_and_45"
        elif(x>45 and x <=60):
            t == print "between_40_and_45"
        elif(x>60 and x <= 80):
            t == print "between_60_and_80"
        else:
            t == print "more_than_80"
    return t
df['hours_week'] = df['hours_per_week'].apply(set_value,args=())
This is the output of the first approach:
File "<ipython-input-36-e90a4b2f98cc>", line 1
df['hours_week'] = ['less_than_40' if x < 40 'between_40_and_45' elif x > 40 and x <= 45 'between_40_and_60' elif x >45 and x <= 60 'between_60_and_80' elif x >60 and x <=80 else 'more_than_80' for x in df['hours_per_week']]
^
SyntaxError: invalid syntax
With the second approach:
File "<ipython-input-44-0a5dc69b4a15>", line 4
t == print " less_than_40"
^
SyntaxError: invalid syntax
Answer 0 (score: 0)
In pandas, we usually use pd.cut for this:
df['hours_week']=pd.cut(df['hours_per_week'],bins=[-np.inf,40,45,60,80,np.inf])
You can also pass labels here: labels=['less_than_40','between_40_and_45'....]
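To make the labels argument concrete, here is a minimal sketch of pd.cut with a full label list. The sample data and the label names are assumptions made for illustration, not taken from the question's dataset. Note that pd.cut closes each interval on the right by default, so a value of exactly 40 lands in the first bin, which is why the first label is written as "less_than_or_equal_to_40" here.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the asker's DataFrame.
df = pd.DataFrame({'hours_per_week': [20, 40, 43, 45, 50, 60, 75, 80, 99]})

# Bin edges; with the default right=True each interval is (left, right].
bins = [-np.inf, 40, 45, 60, 80, np.inf]

# One label per interval, so len(labels) == len(bins) - 1.
labels = ['less_than_or_equal_to_40', 'between_40_and_45',
          'between_45_and_60', 'between_60_and_80', 'more_than_80']

# The result is a categorical column; use .astype(str) for plain strings.
df['hours_week'] = pd.cut(df['hours_per_week'], bins=bins, labels=labels)
print(df)
```

If the first bin should exclude 40 instead, passing right=False makes every interval closed on the left, but that also shifts 45, 60 and 80 into the next bin up, so the labels would need renaming to match.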
Answer 1 (score: 0)
You can also use searchsorted:
bins = pd.Series([40, 45, 60, 80])
labels = ['less_than_40', 'between_40_and_45', 'between_45_and_60',
'between_60_and_80', 'more_than_80']
df['hours_week'] = df['hours_per_week'].map(lambda x: labels[bins.searchsorted(x)])
Strictly speaking, the first label should be "less_than_or_equal_to_40", because searchsorted maps a value of exactly 40 to index 0.
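The same lookup can be done on the whole column at once rather than once per row. The following is a rough sketch using np.searchsorted, with made-up sample values chosen to sit on both sides of each edge; the variable names edges and idx are introduced here for illustration only. With side='left', a value equal to an edge gets that edge's index, so 40 maps to the first label.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data covering values on both sides of every edge.
df = pd.DataFrame({'hours_per_week': [39, 40, 41, 45, 46, 60, 61, 80, 81]})

edges = np.array([40, 45, 60, 80])
labels = np.array(['less_than_or_equal_to_40', 'between_40_and_45',
                   'between_45_and_60', 'between_60_and_80', 'more_than_80'])

# For each value v, side='left' returns i such that edges[i-1] < v <= edges[i].
idx = np.searchsorted(edges, df['hours_per_week'].to_numpy(), side='left')

# Index the label array with the positions to get one label per row.
df['hours_week'] = labels[idx]
print(df)
```

This avoids calling a Python lambda for every row, which may matter on large frames; for small data the map version above is just as good.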