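I have this code to remove the rows whose value occurs with very low frequency. It does remove the low-frequency rows, but it also increases the frequency of the remaining values.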
df = xl.parse("Tabelle1")
# remove places whose frequency is very low
freq = df['bestSpot'].value_counts()
print(freq[:12])
to_remove = freq[freq<7].index
df['bestSpot'].replace(to_remove,None,inplace = True)
df = df.dropna()
freq = df['bestSpot'].value_counts()
print(freq[:12])
Output:
Weissenstein 28
Fiesch 17
Fanas 15
Niesen 11
Brunni 8
Amisbühl 6
Balderen 4
Marbachegg 4
Cimetta 3
Lai Alv 3
Schwängimatt 2
Mornera 2
Name: bestSpot, dtype: int64
Weissenstein 33
Fiesch 28
Fanas 19
Niesen 13
Brunni 10
Name: bestSpot, dtype: int64
Isn't this undesirable behaviour? Does anyone know the reason for it?
Answer 0 (score: 1)
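I think the problem is that you are passing None as the new value. In older pandas versions, when value is None and to_replace is list-like, replace falls back to padding (method='pad'), so each rare value is overwritten with the value from the preceding row instead of becoming a missing value. Those rows then survive dropna and are counted under the remaining spots, which is why their frequencies go up. Try using np.nan instead: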
...
to_remove = freq[freq<7].index
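# requires: import numpy as np
# assign the result back rather than relying on inplace on a column selection
df['bestSpot'] = df['bestSpot'].replace(to_remove, np.nan)
df = df.dropna()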
...
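To make the effect concrete, here is a minimal sketch on a made-up DataFrame (the data, the `score` column and the threshold of 2 are assumptions for illustration, not the original spreadsheet). It emulates the padding behaviour explicitly with `ffill`, so it does not depend on how a particular pandas version treats `None`, and then shows an alternative that skips `replace` altogether by filtering with `isin`.

```python
import pandas as pd

# Hypothetical toy data standing in for the "Tabelle1" sheet.
df = pd.DataFrame({
    "bestSpot": ["A", "B", "A", "C", "A", "B", "D", "A"],
    "score": [1, 2, 3, 4, 5, 6, 7, 8],
})

freq = df["bestSpot"].value_counts()
to_remove = freq[freq < 2].index  # rare values, here "C" and "D"

# Emulate padding: blank out the rare values, then forward-fill them
# from the previous row. The rows stay, relabelled as their neighbour.
masked = df["bestSpot"].where(~df["bestSpot"].isin(to_remove))
padded = masked.ffill()
padded_counts = padded.value_counts()  # counts of "A" and "B" are inflated

# Alternative: keep only the rows whose value is not rare.
filtered = df[~df["bestSpot"].isin(to_remove)]
filtered_counts = filtered["bestSpot"].value_counts()

print(padded_counts)
print(filtered_counts)
```

Filtering with a boolean mask also avoids dropping rows that happen to have missing values in other columns, which a bare `dropna()` would do.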