I need to replace all NaN and NaT values in a pandas.Series with None.
I tried this:
def replaceMissing(ser):
    return ser.where(pd.notna(ser), None)
But it doesn't work:
import pandas as pd
NaN = float('nan')
NaT = pd.NaT
floats1 = pd.Series((NaN, NaN, 2.71828, -2.71828))
floats2 = pd.Series((2.71828, -2.71828, 2.71828, -2.71828))
dates = pd.Series((NaT, NaT, pd.Timestamp("2019-07-09"), pd.Timestamp("2020-07-09")))
def replaceMissing(ser):
    return ser.where(pd.notna(ser), None)
print(pd.__version__)
print(80*"-")
print(replaceMissing(dates))
print(80*"-")
print(replaceMissing(floats1))
print(80*"-")
print(replaceMissing(floats2))
As you can see, NaT is not replaced:
0.24.1
--------------------------------------------------------------------------------
0 NaT
1 NaT
2 2019-07-09
3 2020-07-09
dtype: datetime64[ns]
--------------------------------------------------------------------------------
0 None
1 None
2 2.71828
3 -2.71828
dtype: object
--------------------------------------------------------------------------------
0 2.71828
1 -2.71828
2 2.71828
3 -2.71828
dtype: float64
Then I tried adding this extra step:
def replaceMissing(ser):
    ser = ser.where(pd.notna(ser), None)
    return ser.replace({pd.NaT: None})
But it still doesn't work. For some reason it brings back NaN:
0.24.1
--------------------------------------------------------------------------------
0 None
1 None
2 2019-07-09 00:00:00
3 2020-07-09 00:00:00
dtype: object
--------------------------------------------------------------------------------
0 NaN
1 NaN
2 2.71828
3 -2.71828
dtype: float64
--------------------------------------------------------------------------------
0 2.71828
1 -2.71828
2 2.71828
3 -2.71828
dtype: float64
I also tried converting the series to object:
def replaceMissing(ser):
    return ser.astype("object").where(pd.notna(ser), None)
But now the last series is also object, even though it has no missing values:
0.24.1
--------------------------------------------------------------------------------
0 None
1 None
2 2019-07-09 00:00:00
3 2020-07-09 00:00:00
dtype: object
--------------------------------------------------------------------------------
0 None
1 None
2 2.71828
3 -2.71828
dtype: object
--------------------------------------------------------------------------------
0 2.71828
1 -2.71828
2 2.71828
3 -2.71828
dtype: object
I want it to stay float64, so I added infer_objects:
def replaceMissing(ser):
    return ser.astype("object").where(pd.notna(ser), None).infer_objects()
But that brings back NaN again:
0.24.1
--------------------------------------------------------------------------------
0 None
1 None
2 2019-07-09 00:00:00
3 2020-07-09 00:00:00
dtype: object
--------------------------------------------------------------------------------
0 NaN
1 NaN
2 2.71828
3 -2.71828
dtype: float64
--------------------------------------------------------------------------------
0 2.71828
1 -2.71828
2 2.71828
3 -2.71828
dtype: float64
I feel there must be an easy way to do this. Does anyone know?
Answer 0 (score: 1)
For me, your second solution works if the order is changed (tested in pandas 0.24.2), but the dtype becomes object because of the mixed types, None together with floats or timestamps:
def replaceMissing(ser):
    return ser.replace({pd.NaT: None}).where(pd.notna(ser), None)
print(pd.__version__)
print(80*"-")
print(replaceMissing(dates))
print(80*"-")
print(replaceMissing(dates).apply(type))
print(80*"-")
print(replaceMissing(floats1))
print(80*"-")
print(replaceMissing(floats1).apply(type))
print(80*"-")
print(replaceMissing(floats2))
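If the goal is also to keep float64 when nothing is missing, one possible approach is to switch to object only when the series actually contains missing values. The following is a minimal sketch, not verified on 0.24.x. The name replace_missing is a hypothetical helper, not a pandas function. It builds the result from a plain Python list with an explicit dtype=object, on the assumption that this stops pandas from coercing None back to NaN or NaT:

```python
import pandas as pd


def replace_missing(ser):
    """Return ser with NaN/NaT replaced by None (hypothetical helper)."""
    # No missing values: return the series unchanged so its dtype is kept.
    if not ser.isna().any():
        return ser
    # Build a plain list first; the explicit dtype=object prevents pandas
    # from inferring float64/datetime64 and turning None into NaN/NaT.
    values = [None if pd.isna(v) else v for v in ser]
    return pd.Series(values, index=ser.index, dtype=object, name=ser.name)


floats1 = pd.Series((float("nan"), float("nan"), 2.71828, -2.71828))
floats2 = pd.Series((2.71828, -2.71828, 2.71828, -2.71828))
dates = pd.Series((pd.NaT, pd.NaT, pd.Timestamp("2019-07-09"), pd.Timestamp("2020-07-09")))

for ser in (dates, floats1, floats2):
    out = replace_missing(ser)
    print(out.dtype, out.tolist())
```

The trade-off is that any series which does contain missing values ends up as object, so vectorized numeric and datetime operations are no longer available on it. A float64 or datetime64 series cannot hold None itself.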