How do I remove outliers from a DataFrame?

Time: 2017-05-26 02:55:11

Tags: python statistics data-science outliers

I have a (268x4) df and have found the outliers of one column (22x1). I want to remove those outliers from df. How can I do that?

> import pandas as pd   # to manipulate dataframes
> import numpy as np    # to manipulate arrays
>
> df = df_nonull
>
> # a number "a" from the vector "x" is an outlier if
> # a > median(x) + 1.5*iqr(x) or a < median(x) - 1.5*iqr(x)
> # iqr: interquartile range = third quartile - first quartile
> def outliers(x):
>     return np.abs(x - x.median()) > 1.5 * (x.quantile(0.75) - x.quantile(0.25))
>
> # Give the outliers for the first column, for example
> outliers = df.StockValue[outliers(df.StockValue)]

1 answer:

Answer 0: (score: 1)

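You can only remove whole rows, not individual cells such as (22,1). If you want to drop the complete row of data, for example the row at position 22: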

df = df.drop(df.index[[22]])
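To drop every row flagged by the question's rule instead of a single position, a boolean mask may be more convenient. The following is a minimal sketch, not a definitive implementation: the helper name `is_outlier`, the sample values, and the `Other` column are assumptions made for illustration; only the `StockValue` column name and the median ± 1.5*IQR rule come from the question.

```python
import pandas as pd


def is_outlier(x):
    """Flag values farther than 1.5 * IQR from the median (the question's rule)."""
    iqr = x.quantile(0.75) - x.quantile(0.25)
    return (x - x.median()).abs() > 1.5 * iqr


# Hypothetical sample data standing in for the real (268x4) df.
df = pd.DataFrame({
    "StockValue": [10, 11, 12, 13, 14, 100],
    "Other": list("abcdef"),
})

# Boolean Series: True where StockValue is an outlier.
mask = is_outlier(df["StockValue"])

# Keep only the rows that are not outliers.
df_clean = df[~mask]

# Equivalent form using drop with the index labels of the outlier rows.
df_dropped = df.drop(df[mask].index)
```

Both forms remove the full rows whose `StockValue` is flagged; the mask version avoids hard-coding row positions.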