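Removing rows via null elements increases frequencies in a pandas DataFrame

Time: 2018-01-21 07:33:28

Tags: python pandas dataframe

I have this code to remove the rows whose value occurs with a very low frequency. It does remove the low-frequency rows, but it also increases the frequencies of the remaining values.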

df = xl.parse("Tabelle1")
# remove places whose frequency is very low
freq = df['bestSpot'].value_counts()
print(freq[:12])
to_remove = freq[freq<7].index
df['bestSpot'].replace(to_remove,None,inplace = True)
df = df.dropna()
freq = df['bestSpot'].value_counts()  
print(freq[:12])

Output:

Weissenstein    28
Fiesch          17
Fanas           15
Niesen          11
Brunni           8
Amisbühl         6
Balderen         4
Marbachegg       4
Cimetta          3
Lai Alv          3
Schwängimatt     2
Mornera          2
Name: bestSpot, dtype: int64
Weissenstein    33
Fiesch          28
Fanas           19
Niesen          13
Brunni          10
Name: bestSpot, dtype: int64

Isn't this undesirable behavior? Does anyone know the reason for it?

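1 answer:

Answer 0: (score: 1)

I think the problem is that you are passing None as the new value. In pandas versions of that period, when replace receives a list-like to_replace and value=None, it treats the value as unspecified and falls back to method='pad', so each rare value is overwritten with the value from the preceding row instead of becoming missing. No NaN is created, dropna() removes nothing from that column, and the counts of the remaining values grow. Try using np.nan instead: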

import numpy as np
...
to_remove = freq[freq<7].index
# mark the rare values as missing so that dropna() removes those rows
df['bestSpot'] = df['bestSpot'].replace(to_remove, np.nan)
df = df.dropna()
...
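
To see the effect in isolation, here is a minimal sketch on made-up data. The column values and the threshold of 3 are assumptions for illustration, not the asker's spreadsheet. It emulates forward filling explicitly with ffill() to show how the surviving counts get inflated, and then filters with isin(), which avoids replace and dropna altogether and so should not depend on how a given pandas version handles value=None.

```python
import pandas as pd

# Hypothetical toy data standing in for the 'bestSpot' column.
df = pd.DataFrame({
    "bestSpot": ["A", "A", "A", "B", "A", "C", "A", "B", "B", "D"],
})

freq = df["bestSpot"].value_counts()
rare = freq[freq < 3].index  # threshold 3 is arbitrary for this toy data

# Emulate forward filling: blank out the rare values, then copy the value
# from the row above into each gap. No row ends up missing, so the counts
# of the frequent values increase.
masked = df["bestSpot"].where(~df["bestSpot"].isin(rare))
padded = masked.ffill()
padded_counts = padded.value_counts()

# Alternative: keep only the rows whose value is not rare.
filtered = df[~df["bestSpot"].isin(rare)]
filtered_counts = filtered["bestSpot"].value_counts()

print(padded_counts)
print(filtered_counts)
```

Note also that df.dropna() drops a row if any column is missing; if only bestSpot should be considered, df.dropna(subset=['bestSpot']) restricts the check to that column.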