The following column in my DataFrame needs to be converted to int:
dsAttendEnroll.District.head()
0 DISTRICT 01
1 DISTRICT 02
2 DISTRICT 03
3 DISTRICT 04
4 DISTRICT 05
Name: District, dtype: object
Using astype raises the following error. How can I do this?
dsAttendEnroll.District = dsAttendEnroll.District.map(lambda x: x[-2:]).astype(int)
ValueError: invalid literal for long() with base 10: 'LS'
Answer 0 (score: 2)
You can try:
dsAttendEnroll.District=pd.to_numeric(dsAttendEnroll.District)
dsAttendEnroll.District=dsAttendEnroll.District.astype(int)
See the pandas documentation for to_numeric.
Answer 1 (score: 2)
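You can use split, select the second element of each resulting list with str[1], and pass the result to to_numeric with the parameter errors='coerce', which converts non-numeric values to NaN: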
print (df)
District
0 DISTRICT 01
1 DISTRICT 02
2 DISTRICT 03
3 DISTRICT 04
4 DISTRICT 05
5 DISTRICT LS
print (df.District.str.split().str[1])
0 01
1 02
2 03
3 04
4 05
5 LS
Name: District, dtype: object
print (pd.to_numeric(df.District.str.split().str[1], errors='coerce'))
0 1.0
1 2.0
2 3.0
3 4.0
4 5.0
5 NaN
Name: District, dtype: float64
Another solution is to slice the last 2 characters:
print (df.District.str[-2:])
0 01
1 02
2 03
3 04
4 05
5 LS
Name: District, dtype: object
print (pd.to_numeric(df.District.str[-2:], errors='coerce'))
0 1.0
1 2.0
2 3.0
3 4.0
4 5.0
5 NaN
Name: District, dtype: float64
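Both results above are float64 because of the NaN, while the question asks for int. A minimal sketch of two possible ways to finish the conversion follows. It assumes a pandas version that supports the nullable "Int64" dtype, and the names codes, nullable, and plain are illustrative only:

```python
import pandas as pd

# Small sample mirroring the data in the question, including the 'LS' row.
df = pd.DataFrame({"District": ["DISTRICT 01", "DISTRICT 02", "DISTRICT LS"]})

# Take the last two characters and coerce anything non-numeric to NaN.
codes = pd.to_numeric(df["District"].str[-2:], errors="coerce")

# Option 1 (assumption: nullable integers are acceptable downstream):
# keep every row and represent the unparseable value as <NA>.
nullable = codes.astype("Int64")

# Option 2 (assumption: unparseable rows can be discarded):
# drop the NaN rows first, then cast to a plain integer dtype.
plain = codes.dropna().astype(int)

print(nullable)
print(plain)
```

Which option fits depends on whether rows such as 'DISTRICT LS' should stay in the data. A plain astype(int) on a column that still contains NaN raises an error, so one of these steps (or a fillna with a chosen placeholder) is needed first.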