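How do I replace specific values in a specific column of a Pandas DataFrame based on a condition?

Time: 2019-03-23 18:12:56

Tags: python-3.x pandas dataframe

I have a Pandas DataFrame containing students and the percentage marks they obtained. Some students' marks show as greater than 100%. These values are obviously incorrect, and I would like to replace every percentage greater than 100% with NaN.

I have tried some code, but it does not quite do what I want.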

import numpy as np
import pandas as pd

new_DF = pd.DataFrame({'Student' : ['S1', 'S2', 'S3', 'S4', 'S5'],
                       'Percentages' : [85, 70, 101, 55, 120]})

#  Percentages  Student
#0          85       S1
#1          70       S2
#2         101       S3
#3          55       S4
#4         120       S5

new_DF[(new_DF.iloc[:, 0] > 100)] = np.NaN

#  Percentages  Student
#0        85.0       S1
#1        70.0       S2
#2         NaN      NaN
#3        55.0       S4
#4         NaN      NaN

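As you can see, the code runs, but it replaces every value in each row where the percentage is greater than 100 with NaN. I only want to replace the values in the Percentages column that are greater than 100 with NaN. Is there a way to do that?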

4 Answers:

Answer 0: (score: 3)

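Try np.where:

new_DF.Percentages = np.where(new_DF.Percentages.gt(100), np.nan, new_DF.Percentages)

Or assign through .loc, restricted to the Percentages column (cast the column to float first if it is still integer-typed, since recent pandas versions warn when NaN is written into an integer column):

new_DF.loc[new_DF.Percentages.gt(100), 'Percentages'] = np.nan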

print(new_DF)

  Student  Percentages
0      S1         85.0
1      S2         70.0
2      S3          NaN
3      S4         55.0
4      S5          NaN

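Answer 1: (score: 2)

Use apply with a lambda:

new_DF.Percentages = new_DF.Percentages.apply(lambda x: np.nan if x > 100 else x)

Or use Series.where, which keeps the values where the condition holds and replaces the rest (note <= rather than <, so that a valid 100 is kept):

new_DF.Percentages = new_DF.Percentages.where(new_DF.Percentages <= 100, np.nan)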
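Since the direction of the condition in Series.where is easy to get backwards, here is a minimal sketch comparing it with Series.mask, which takes the opposite condition. The variable names kept, masked and same are illustrative only and do not come from the answer.

```python
import pandas as pd

new_DF = pd.DataFrame({'Student': ['S1', 'S2', 'S3', 'S4', 'S5'],
                       'Percentages': [85, 70, 101, 55, 120]})

# where() keeps values where the condition is True; the rest become NaN
kept = new_DF['Percentages'].where(new_DF['Percentages'] <= 100)

# mask() is the inverse: values where the condition is True become NaN
masked = new_DF['Percentages'].mask(new_DF['Percentages'] > 100)

# Both spellings should produce the same Series
same = kept.equals(masked)
print(kept)
print(same)
```

Both calls return a new Series, so the result still has to be assigned back to the column if the DataFrame itself should change.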

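Answer 2: (score: 1)

You can use .loc with a boolean row mask and the column label:

# Cast to float first: recent pandas versions warn when NaN is written into an integer column
new_DF['Percentages'] = new_DF['Percentages'].astype(float)
new_DF.loc[new_DF['Percentages'] > 100, 'Percentages'] = np.nan

Output: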

  Student  Percentages
0      S1         85.0
1      S2         70.0
2      S3          NaN
3      S4         55.0
4      S5          NaN
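The output shows 85.0 rather than 85 because NaN is a float, so the column is upcast to float64. If integer values matter, one possible alternative (not part of the answer above) is pandas' nullable Int64 dtype, sketched below. The variable names are illustrative.

```python
import pandas as pd

raw = pd.Series([85, 70, 101, 55, 120])

# With the default int64 dtype, masking upcasts the result to float64
as_float = raw.mask(raw > 100)

# With the nullable Int64 dtype, missing entries become <NA> and the rest stay integers
as_nullable = raw.astype('Int64').mask(raw > 100)

float_dtype = str(as_float.dtype)
nullable_dtype = str(as_nullable.dtype)
print(as_float)
print(as_nullable)
```

Whether Int64 is worth it depends on what consumes the column afterwards, since some libraries still expect plain NumPy dtypes.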

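Answer 3: (score: 0)

import numpy as np
import pandas as pd

new_DF = pd.DataFrame({'Student' : ['S1', 'S2', 'S3', 'S4', 'S5'],
                       'Percentages' : [85, 70, 101, 55, 120]})

# Cast to float up front so the column can hold NaN
new_DF['Percentages'] = new_DF['Percentages'].astype(float)

for idx in new_DF.index:
    if new_DF.at[idx, 'Percentages'] > 100:
        # .at avoids chained assignment; assign the float np.nan, not the string "nan"
        new_DF.at[idx, 'Percentages'] = np.nan

print(new_DF)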
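When writing a missing value by hand, as in the loop above, the value should be the float np.nan rather than the string "nan". A small sketch of the difference, using two made-up Series purely for illustration:

```python
import numpy as np
import pandas as pd

# The string "nan" is an ordinary object value, so pandas does not treat it as missing
as_string = pd.Series([85, "nan", 55], dtype=object)

# np.nan is the float missing-value marker that isna(), dropna(), etc. recognise
as_float = pd.Series([85, np.nan, 55])

string_missing = int(as_string.isna().sum())
float_missing = int(as_float.isna().sum())
print(string_missing, float_missing)
```

A column containing the string also ends up with object dtype, which makes later numeric operations such as mean() unreliable.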