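Python: convert specific DataFrame columns to integer

Time: 2017-10-07 09:44:24

Tags: python dataframe integer type-conversion

I have a DataFrame with 8 columns, and I want to convert the last six columns to integers. The DataFrame also contains NaN values, and I do not want to drop them.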


       a  b     c     d     e     f    g    h
0   john  1   NaN   2.0   2.0  42.0  3.0  NaN
1  david  2  28.0  52.0  15.0   NaN  2.0  NaN
2  kevin  3   1.0   NaN   1.0  10.0  1.0  5.0

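Any ideas?

Thanks.

2 answers:

Answer 0 (score: 2)

Thanks to @AntonvBR for the downcast='integer' hint. Only the columns without NaN (e and g here) are actually downcast to an integer dtype; the columns that contain NaN stay float64.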

In [29]: df.iloc[:, -6:] = df.iloc[:, -6:].apply(pd.to_numeric, errors='coerce', downcast='integer')

In [30]: df
Out[30]:
       a  b     c     d   e     f  g    h
0   john  1   NaN   2.0   2  42.0  3  NaN
1  david  2  28.0  52.0  15   NaN  2  NaN
2  kevin  3   1.0   NaN   1  10.0  1  5.0

In [31]: df.dtypes
Out[31]:
a     object
b      int64
c    float64
d    float64
e       int8
f    float64
g       int8
h    float64
dtype: object
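
As the dtypes above show, columns that contain NaN remain float64. If integer columns that still keep their missing values are required, one option not mentioned in this answer is pandas' nullable integer dtype "Int64" (capital I, available since pandas 0.24). The following is only a sketch under that assumption; the names int_cols and converted are made up for illustration, and the cast assumes every non-missing value is a whole number.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": ["john", "david", "kevin"],
    "b": [1, 2, 3],
    "c": [np.nan, 28.0, 1.0],
    "d": [2.0, 52.0, np.nan],
    "e": [2.0, 15.0, 1.0],
    "f": [42.0, np.nan, 10.0],
    "g": [3.0, 2.0, 1.0],
    "h": [np.nan, np.nan, 5.0],
})

# Labels of the last six columns (c through h).
int_cols = df.columns[-6:]

# Cast each of them to the nullable integer dtype; NaN becomes <NA>.
converted = df.astype({col: "Int64" for col in int_cols})

print(converted)
print(converted.dtypes)
```

Missing entries are shown as <NA> instead of NaN, and isna() still reports them as missing.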

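Answer 1 (score: 2)

Thanks to MaxU. I am adding this option, which replaces NaN with -1:

Reason: NaN is a floating-point value and cannot coexist with integers in a column of an ordinary integer dtype. So either keep the NaN values together with the float dtype, or choose to treat -1 as NaN.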

http://pandas.pydata.org/pandas-docs/version/0.20/generated/pandas.to_numeric.html
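
The reason given above can be checked with a tiny example. This is a minimal sketch, not part of the original answer; the names s and cast_failed are made up for illustration.

```python
import numpy as np
import pandas as pd

# A single NaN forces the whole Series to a float dtype.
s = pd.Series([1, 2, np.nan])
print(s.dtype)  # float64

# Casting to a plain NumPy integer dtype fails while NaN is present.
cast_failed = False
try:
    s.astype("int64")
except ValueError as exc:
    cast_failed = True
    print("cannot cast:", exc)
```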

import pandas as pd
import numpy as np

df = pd.DataFrame.from_dict({'a': {0: 'john', 1: 'david', 2: 'kevin'},
 'b': {0: 1, 1: 2, 2: 3},
 'c': {0: np.nan, 1: 28.0, 2: 1.0},
 'd': {0: 2.0, 1: 52.0, 2: np.nan},
 'e': {0: 2.0, 1: 15.0, 2: 1.0},
 'f': {0: 42.0, 1: np.nan, 2: 10.0},
 'g': {0: 3.0, 1: 2.0, 2: 1.0},
 'h': {0: np.nan, 1: np.nan, 2: 5.0}})

df.iloc[:, -6:] = df.iloc[:, -6:].fillna(-1)
df.iloc[:, -6:] = df.iloc[:, -6:].apply(pd.to_numeric, downcast='integer')

df

       a  b   c   d   e   f  g  h
0   john  1  -1   2   2  42  3 -1
1  david  2  28  52  15  -1  2 -1
2  kevin  3   1  -1   1  10  1  5
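
One caveat with the -1 approach is that it only works if -1 never occurs as a real value in the data. Under that assumption the placeholder can later be turned back into NaN. The sketch below uses a two-column excerpt of the data; the names as_int and restored are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"c": [np.nan, 28.0, 1.0],
                   "h": [np.nan, np.nan, 5.0]})

# Replace NaN with the placeholder, then cast to a real integer dtype.
as_int = df.fillna(-1).astype("int64")

# Later: turn the placeholder back into NaN (the columns become float64 again).
restored = as_int.mask(as_int == -1)

print(as_int.dtypes)
print(restored)
```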