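I have a DataFrame with 8 columns, and I want to convert the last six columns to integers. The DataFrame also contains NaN values, and I don't want to drop them.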
a b c d e f g h
0 john 1 NaN 2.0 2.0 42.0 3.0 NaN
1 david 2 28.0 52.0 15.0 NaN 2.0 NaN
2 kevin 3 1.0 NaN 1.0 10.0 1.0 5.0
Any ideas?
Thanks.
Answer 0 (score: 2)
Thanks to @AntonvBR for the downcast='integer' hint:
In [29]: df.iloc[:, -6:] = df.iloc[:, -6:].apply(pd.to_numeric, errors='coerce', downcast='integer')
In [30]: df
Out[30]:
a b c d e f g h
0 john 1 NaN 2.0 2 42.0 3 NaN
1 david 2 28.0 52.0 15 NaN 2 NaN
2 kevin 3 1.0 NaN 1 10.0 1 5.0
In [31]: df.dtypes
Out[31]:
a object
b int64
c float64
d float64
e int8
f float64
g int8
h float64
dtype: object
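In the output above, only columns e and g became int8. The columns that contain NaN (c, d, f, h) stayed float64, because to_numeric only downcasts when every value fits the smaller type. The following is a minimal sketch that rebuilds the question's data and inspects the converted columns directly; the variable names cols, converted and dtypes are illustrative and do not come from the original answer.

```python
import numpy as np
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    "a": ["john", "david", "kevin"],
    "b": [1, 2, 3],
    "c": [np.nan, 28.0, 1.0],
    "d": [2.0, 52.0, np.nan],
    "e": [2.0, 15.0, 1.0],
    "f": [42.0, np.nan, 10.0],
    "g": [3.0, 2.0, 1.0],
    "h": [np.nan, np.nan, 5.0],
})

# Last six columns, selected by label.
cols = df.columns[-6:]

# Apply to_numeric column by column; each column gets its own result dtype.
converted = df[cols].apply(pd.to_numeric, errors="coerce", downcast="integer")

# Map each column name to its dtype name for easy inspection.
dtypes = converted.dtypes.astype(str).to_dict()
print(dtypes)
```

Columns with at least one NaN keep a float dtype, so this approach alone does not make every column an integer column.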
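Answer 1 (score: 2)
Thanks to MaxU. I'm adding this option, which replaces NaN with -1:
Reason: NaN is a floating-point value and cannot coexist with integers in the same column. So either keep the NaN values and leave the column as float, or choose to treat -1 as NaN.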
http://pandas.pydata.org/pandas-docs/version/0.20/generated/pandas.to_numeric.html
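The reasoning above can be checked on a single column. This is a small sketch, not part of the original answer: it shows that a column holding NaN is stored as float64, that a plain integer cast fails, and that filling with -1 makes the downcast possible. The last step uses the nullable "Int64" extension dtype, which neither answer mentions; it is included here as an assumption about your pandas version (it needs 0.24 or later) and as one possible way to keep missing values in an integer column.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 5.0])

# NaN is itself a Python float, so the column is stored as float64.
nan_is_float = isinstance(np.nan, float)
original_dtype = str(s.dtype)

# A plain cast to a NumPy integer dtype fails while NaN is present.
try:
    s.astype("int64")
    cast_failed = False
except ValueError:
    cast_failed = True

# Option from this answer: replace NaN with a sentinel (-1), then downcast.
filled = pd.to_numeric(s.fillna(-1), downcast="integer")

# Alternative (assumption, not from the answers): nullable integer dtype.
nullable = s.astype("Int64")

print(nan_is_float, original_dtype, cast_failed)
print(filled.dtype, nullable.dtype)
```

The sentinel approach only works if -1 can never be a real value in the data; the nullable dtype avoids that trade-off but behaves differently from NumPy integers in some operations.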
import pandas as pd
import numpy as np
df = pd.DataFrame.from_dict({'a': {0: 'john', 1: 'david', 2: 'kevin'},
'b': {0: 1, 1: 2, 2: 3},
'c': {0: np.nan, 1: 28.0, 2: 1.0},
'd': {0: 2.0, 1: 52.0, 2: np.nan},
'e': {0: 2.0, 1: 15.0, 2: 1.0},
'f': {0: 42.0, 1: np.nan, 2: 10.0},
'g': {0: 3.0, 1: 2.0, 2: 1.0},
'h': {0: np.nan, 1: np.nan, 2: 5.0}})
df.iloc[:, -6:] = df.iloc[:, -6:].fillna(-1)
df.iloc[:, -6:] = df.iloc[:, -6:].apply(pd.to_numeric, downcast='integer')
df
a b c d e f g h
0 john 1 -1 2 2 42 3 -1
1 david 2 28 52 15 -1 2 -1
2 kevin 3 1 -1 1 10 1 5