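I am trying to remove some values from a DataFrame (the temperature column is named Temperature). Normal values look like 10.0, 10.5, 40.0, but the values I want to remove look like 140.0, 159.5, and so on. I use the following function, but it does not remove anything: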
def remove_outlier(df, col_name):
    threshold = 100.0  # Anything above this will be removed.
    value_counts = df.stack().value_counts()  # Entire DataFrame
    to_remove = value_counts[value_counts >= threshold].index
    if len(to_remove) > 0:
        df[col_name].replace(to_remove, np.nan)
    return df
Answer 0 (score: 6)
Try
df = df[df[col_name] < threshold]
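or (note the parentheses: ~ binds more tightly than the comparison, so without them the expression raises an error on a float column)
df = df[~(df[col_name] > threshold)]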
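A minimal runnable sketch of the two forms above. The sample data and the names below and negated are made up for illustration; only the Temperature column name and the threshold of 100.0 come from the question.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({"Temperature": [10.0, 10.5, 40.0, 140.0, 159.5]})
threshold = 100.0

# Keep only the rows strictly below the threshold.
below = df[df["Temperature"] < threshold]

# Negated form: keep the rows that are NOT above the threshold.
# The parentheses are required because ~ has higher precedence than >.
# This differs from the first form only for values exactly equal to
# the threshold (kept here) and for NaN (also kept here).
negated = df[~(df["Temperature"] > threshold)]

print(below["Temperature"].tolist())
print(negated["Temperature"].tolist())
```

Both forms return a new DataFrame, so the result has to be assigned back (df = ...) for the filtering to take effect.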
Answer 1 (score: 1)
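>>> import pandas as pd
>>> d = pd.DataFrame({'City': list('ABCDEF'),
...                   'Temperature': [10.0, 10.5, 140.0, 30.0, 145.0, 99.0]})
>>> d
   City  Temperature
0     A         10.0
1     B         10.5
2     C        140.0
3     D         30.0
4     E        145.0
5     F         99.0
>>> def remove_outlier(dataFrame, col_name='Temperature', threshold=100):
...     # Keep only the rows whose value is strictly below the threshold.
...     return dataFrame[dataFrame[col_name] < threshold]
...
>>> remove_outlier(d)
   City  Temperature
0     A         10.0
1     B         10.5
3     D         30.0
5     F         99.0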
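If the goal is to blank out the outliers rather than drop the rows (the function in the question tried to replace them with np.nan), a sketch along these lines may work. The helper name mask_outlier is hypothetical, and the data is the same six-row frame shown above.

```python
import pandas as pd

d = pd.DataFrame({
    "City": list("ABCDEF"),
    "Temperature": [10.0, 10.5, 140.0, 30.0, 145.0, 99.0],
})

def mask_outlier(data_frame, col_name="Temperature", threshold=100):
    """Return a copy with values at or above threshold replaced by NaN."""
    result = data_frame.copy()
    # Series.mask replaces entries where the condition is True with NaN.
    result[col_name] = result[col_name].mask(result[col_name] >= threshold)
    return result

masked = mask_outlier(d)
print(masked["Temperature"].isna().tolist())
```

All six rows are kept, so the index is unchanged; only the Temperature values of rows 2 and 4 become NaN.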
Answer 2 (score: 0)
You can also use the query function of pandas.
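A minimal sketch of what a query-based filter could look like, assuming the column is named Temperature and the cutoff is 100.0 as in the question. The sample data and the variable name filtered are illustrative, not taken from the original answer.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({"Temperature": [10.0, 10.5, 40.0, 140.0, 159.5]})
threshold = 100.0

# DataFrame.query evaluates the expression against the column names;
# the @ prefix refers to a Python variable in the calling scope.
filtered = df.query("Temperature < @threshold")

print(filtered["Temperature"].tolist())
```

Like boolean indexing, query returns a new DataFrame by default, so the result needs to be assigned to a name.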