I found this recipe for keeping only the finite entries in my data frame.
The recipe is:
df[df == np.Inf] = np.NaN
df.dropna()
However, when I try it:
In: df[df == np.Inf] = np.NaN
## -- End pasted text --
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-4-88eed8630e79> in <module>()
----> 1 df[df == np.Inf] = np.NaN
/Users/josh/anaconda/envs/py27/lib/python2.7/site-packages/pandas/core/frame.pyc in __setitem__(self, key, value)
/Users/josh/anaconda/envs/py27/lib/python2.7/site-packages/pandas/core/frame.pyc in _setitem_frame(self, key, value)
TypeError: Cannot do boolean setting on mixed-type frame
Is there a better way to filter rows so that we keep only the finite entries in specific columns?
Answer 0 (score: 2)
Use np.isinf():
import numpy as np
import pandas

x = pandas.DataFrame([
    [1, 2, np.inf],
    [4, np.inf, 5],
    [6, 7, 8]
])
x[np.isinf(x)] = np.nan  # replace +/-inf with NaN in place
print(x)
   0    1    2
0  1    2  NaN
1  4  NaN    5
2  6    7    8
So x.dropna() gives me:
   0  1  2
2  6  7  8
To consider only a subset of columns, use the subset kwarg (always pass a list):
x.dropna(subset=[1])
   0  1    2
0  1  2  NaN
2  6  7    8
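You can also take DSM's suggestion and index the data frame directly, which keeps only the rows that contain no infinite values: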
x[~np.isinf(x).any(axis=1)]
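Applying np.isinf to the whole frame can hit the same kind of problem as in the question when the frame also holds non-numeric columns. A minimal sketch of restricting the check to specific columns, assuming a made-up frame whose column names (name, x, y) are purely illustrative; np.isfinite is used here instead of np.isinf so that NaN is filtered in the same pass:

```python
import numpy as np
import pandas as pd

# Hypothetical mixed-type frame: one string column next to numeric ones.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "x": [1.0, np.inf, 3.0, np.nan],
    "y": [0.5, 2.0, -np.inf, 4.0],
})

# Columns to check (assumed names, for illustration only).
cols = ["x", "y"]

# np.isfinite is False for inf, -inf and NaN, so one mask covers all three.
mask = np.isfinite(df[cols]).all(axis=1)

# Boolean indexing returns a filtered copy; df itself is left untouched.
finite_rows = df.loc[mask]
print(finite_rows)
```

Because only the selected numeric columns are passed to np.isfinite, the string column never reaches the ufunc, and no boolean assignment on the mixed-type frame is needed.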
Answer 1 (score: 2)
As suggested here, you can use mode.use_inf_as_null:
In [14]: df = DataFrame({'a': randint(3,size=10)})
In [15]: df['b'] = tm.choice([2,3,nan,inf,-inf], size=len(df))
In [16]: df
Out[16]:
   a       b
0  1     inf
1  2    -inf
2  0  3.0000
3  1    -inf
4  2     NaN
5  1  3.0000
6  1     inf
7  0  2.0000
8  2    -inf
9  2     inf
In [17]: with pd.option_context('mode.use_inf_as_null', True):
   ....:     res = df.dropna()
   ....:
In [18]: res
Out[18]:
   a  b
2  0  3
5  1  3
7  0  2
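The name of this option has changed between pandas releases (as far as I can tell it was later renamed mode.use_inf_as_na and has since been deprecated), so it may not be available in your version. A version-independent sketch of the same idea, assuming a small hand-written frame in place of the random one above, is to replace the infinities with NaN explicitly and then drop on the column of interest:

```python
import numpy as np
import pandas as pd

# Small illustrative frame (values chosen by hand, not taken from the session above).
df = pd.DataFrame({
    "a": [1, 2, 0, 1, 2],
    "b": [np.inf, -np.inf, 3.0, np.nan, 2.0],
})

# replace() returns a new frame with +/-inf turned into NaN;
# dropna(subset=...) then removes rows where column "b" is missing.
res = df.replace([np.inf, -np.inf], np.nan).dropna(subset=["b"])
print(res)
```

Both calls return new objects, so the original df still contains its infinite values afterwards.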