Unable to convert a pandas column from string to int

Asked: 2016-09-26 17:34:43

Tags: string python-2.7 pandas int

The following column in a DataFrame needs to be converted to int:

dsAttendEnroll.District.head()

0    DISTRICT 01
1    DISTRICT 02
2    DISTRICT 03
3    DISTRICT 04
4    DISTRICT 05
Name: District, dtype: object

Using astype raises the following error. How can I do this?

dsAttendEnroll.District = dsAttendEnroll.District.map(lambda x: x[-2:]).astype(int)
  

ValueError: invalid literal for long() with base 10: 'LS'

2 Answers:

Answer 0: (score: 2)

You can try the following, which assumes the column already holds only numeric strings:

dsAttendEnroll.District=pd.to_numeric(dsAttendEnroll.District)
dsAttendEnroll.District=dsAttendEnroll.District.astype(int)

See the pd.to_numeric documentation.
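
pd.to_numeric raises a ValueError on strings such as 'DISTRICT 01', so the text prefix has to be removed before the call. A minimal sketch of that combination, assuming every row ends in a two-digit number; the sample frame and the name codes are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data: every value ends in two digits.
df = pd.DataFrame({"District": ["DISTRICT 01", "DISTRICT 02", "DISTRICT 03"]})

# Keep only the last two characters, e.g. "01".
codes = df["District"].str[-2:]

# Parse the digit strings, then cast to a plain integer dtype.
df["District"] = pd.to_numeric(codes).astype(int)

print(df["District"].tolist())
```

If any row does not end in digits (for example 'DISTRICT LS'), this still raises, which is the case the next answer handles with errors='coerce'.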

Answer 1: (score: 2)

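You can use split, select the second item of each resulting list with str[1], and pass the result to to_numeric with errors='coerce', which converts non-numeric values to NaN: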

print (df)
      District
0  DISTRICT 01
1  DISTRICT 02
2  DISTRICT 03
3  DISTRICT 04
4  DISTRICT 05
5  DISTRICT LS

print (df.District.str.split().str[1])
0    01
1    02
2    03
3    04
4    05
5    LS
Name: District, dtype: object

print (pd.to_numeric(df.District.str.split().str[1], errors='coerce'))
0    1.0
1    2.0
2    3.0
3    4.0
4    5.0
5    NaN
Name: District, dtype: float64

Another solution is to slice the last 2 characters:

print (df.District.str[-2:])
0    01
1    02
2    03
3    04
4    05
5    LS
Name: District, dtype: object

print (pd.to_numeric(df.District.str[-2:], errors='coerce'))
0    1.0
1    2.0
2    3.0
3    4.0
4    5.0
5    NaN
Name: District, dtype: float64
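
Both variants return float64 because NaN cannot be stored in a plain int column. The sketch below shows a few possible follow-ups, not the answerer's own method: listing the rows that failed to parse, casting to the nullable Int64 dtype (this assumes pandas 0.24 or later), or dropping the missing rows before astype(int). The names numbers, bad_rows, nullable and strict are illustrative only.

```python
import pandas as pd

# Hypothetical sample data with one non-numeric district code.
df = pd.DataFrame({"District": ["DISTRICT 01", "DISTRICT 02", "DISTRICT LS"]})

# Non-numeric codes become NaN, so the result is float64.
numbers = pd.to_numeric(df["District"].str[-2:], errors="coerce")

# Inspect the rows that could not be parsed.
bad_rows = df.loc[numbers.isna(), "District"]
print(bad_rows.tolist())

# Option A: nullable integer dtype keeps the row as a missing value.
nullable = numbers.astype("Int64")
print(nullable.dtype)

# Option B: drop the unparsable rows, then cast to a plain int dtype.
strict = numbers.dropna().astype(int)
print(strict.tolist())
```

Which option fits depends on whether a code like 'LS' is a data-entry error to discard or a legitimate district that needs its own representation.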