Remove values above a threshold

Time: 2018-10-29 12:06:17

Tags: python pandas

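I am trying to remove some values from a DataFrame (temp is the temperature). The values are like 10.0, 10.5, 40.0, but the ones I want to remove look like 140.0, 159.5, and so on. I use the following function, but it does not remove anything: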

def remove_outlier(df, col_name):
    threshold = 100.0  # Anything that occurs above this will be removed.
    value_counts = df.stack().value_counts()  # Entire DataFrame
    to_remove = value_counts[value_counts >= threshold].index
    if(len(to_remove) > 0):
        df[col_name].replace(to_remove, np.nan)
    return df

3 Answers:

Answer 0: (score: 6)

Try

df = df[df[col_name] < threshold]

or

df = df[~(df[col_name] > threshold)]
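
A minimal sketch of the two filters on a small, made-up DataFrame (the sample data is an assumption for illustration only). Note the parentheses around the comparison: without them, ~ is applied to the column itself before > is evaluated. The two forms also differ at the edges: the first drops rows equal to the threshold and rows with NaN, while the second keeps them.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column name and values are illustrative only.
df = pd.DataFrame({"Temperature": [10.0, 10.5, 140.0, 100.0, np.nan]})
col_name = "Temperature"
threshold = 100.0

# Keep rows strictly below the threshold.
# NaN < threshold is False, so NaN rows are dropped too.
below = df[df[col_name] < threshold]

# Keep rows that are NOT above the threshold.
# Rows equal to the threshold and NaN rows survive this filter.
not_above = df[~(df[col_name] > threshold)]

print(below)
print(not_above)
```

If NaN rows or rows exactly at the threshold matter in your data, pick the form whose edge behavior matches what you want.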

Answer 1: (score: 1)

Expanding on GRS's answer:

>>> import pandas as pd

>>> d

  City  Temperature
0    A         10.0
1    B         10.5
2    C        140.0
3    D         30.0
4    E        145.0
5    F         99.0


>>> def remove_outlier(dataFrame, col_name='Temperature', threshold=100):
...     return dataFrame[dataFrame[col_name] < threshold]

>>> remove_outlier(d)

  City  Temperature
0    A         10.0
1    B         10.5
3    D         30.0
5    F         99.0
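
The function in the question tried to replace the outliers with NaN rather than drop the rows. If that is what you need, a sketch along the following lines might work. It uses Series.mask, which replaces values where the condition is True with NaN. The function name mask_outlier is hypothetical, and the sample data is rebuilt here so the block runs on its own.

```python
import pandas as pd

def mask_outlier(data_frame, col_name="Temperature", threshold=100):
    """Return a copy with values at or above threshold replaced by NaN."""
    result = data_frame.copy()  # leave the caller's DataFrame untouched
    result[col_name] = result[col_name].mask(result[col_name] >= threshold)
    return result

# Same sample data as in the session above.
d = pd.DataFrame({
    "City": list("ABCDEF"),
    "Temperature": [10.0, 10.5, 140.0, 30.0, 145.0, 99.0],
})

masked = mask_outlier(d)
print(masked)
```

All six rows are kept; only the Temperature entries for cities C and E become NaN.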

Answer 2: (score: 0)

You can also use the query function of pandas.
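
A minimal sketch of a query-based filter, assuming the same sample data as in the answer above. The variable names here are illustrative and not taken from the answer. Inside the query string, a leading @ refers to a Python variable in the surrounding scope.

```python
import pandas as pd

# Sample data matching the earlier answer; rebuilt so the block is self-contained.
d = pd.DataFrame({
    "City": list("ABCDEF"),
    "Temperature": [10.0, 10.5, 140.0, 30.0, 145.0, 99.0],
})

threshold = 100

# Keep only the rows whose Temperature is below the threshold.
filtered = d.query("Temperature < @threshold")
print(filtered)
```

This should select the same rows as the boolean-mask version, d[d["Temperature"] < threshold].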