I have a DataFrame with two columns that I want to convert to a numeric type. I am using the following code:
df[["GP","G"]]=df[["GP","G"]].apply(pd.to_numeric)
Python returns the following error message:
File "C:\Users\Alexandros_7\Anaconda3\lib\site-packages\pandas\core\frame.py", line 4157, in _apply_standard
results[i] = func(v)
File "C:\Users\Alexandros_7\Anaconda3\lib\site-packages\pandas\tools\util.py", line 115, in to_numeric
coerce_numeric=coerce_numeric)
File "pandas\src\inference.pyx", line 612, in pandas.lib.maybe_convert_numeric (pandas\lib.c:53558)
File "pandas\src\inference.pyx", line 598, in pandas.lib.maybe_convert_numeric (pandas\lib.c:53344)
ValueError: ('Unable to parse string', 'occurred at index GP')
How can I fix this? And how can I convert the types of several columns with a single command? Thanks!
Answer 0 (score: 4)
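Your code only works if every value in those columns can be parsed as a number.
If it fails, at least one value in the DataFrame cannot be converted to a number. In that case you can use the errors parameter of pd.to_numeric, depending on the behavior you want. Here is an example.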
>>> df = pd.DataFrame({'A' : list('aabbcd'), 'B' : list('ffghhe')})
>>> df
   A  B
0  a  f
1  a  f
2  b  g
3  b  h
4  c  h
5  d  e
>>> df.apply(pd.to_numeric, errors='ignore')
   A  B
0  a  f
1  a  f
2  b  g
3  b  h
4  c  h
5  d  e
>>> df.apply(pd.to_numeric, errors='coerce')
    A   B
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
5 NaN NaN
>>> df.apply(pd.to_numeric, errors='raise')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 4042, in apply
return self._apply_standard(f, axis, reduce=reduce)
File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 4138, in _apply_standard
results[i] = func(v)
File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 4020, in f
return func(x, *args, **kwds)
File "/usr/local/lib/python2.7/dist-packages/pandas/tools/util.py", line 98, in to_numeric
coerce_numeric=coerce_numeric)
File "pandas/src/inference.pyx", line 612, in pandas.lib.maybe_convert_numeric (pandas/lib.c:53932)
File "pandas/src/inference.pyx", line 598, in pandas.lib.maybe_convert_numeric (pandas/lib.c:53719)
ValueError: ('Unable to parse string', u'occurred at index A')
>>>
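Before picking an errors mode, it can help to see which values are actually failing to parse. The following is only a sketch: the sample values in GP and G are made up for illustration (the question does not show the real data), and the names cols, converted and bad are my own.

```python
import pandas as pd

# Hypothetical data: numbers stored as strings, plus a few entries that cannot be parsed.
df = pd.DataFrame({
    "GP": ["82", "79", "-", "64"],
    "G": ["30", "n/a", "12", "7"],
})

cols = ["GP", "G"]

# Coerce on a copy so the original strings stay available for inspection.
converted = df[cols].apply(pd.to_numeric, errors="coerce")

# A value is unparseable if it was present originally but became NaN after coercion.
bad = converted.isna() & df[cols].notna()

for col in cols:
    print(col, df.loc[bad[col], col].tolist())
```

Once you know what the offending strings look like, you can decide whether to clean them first or simply let them become NaN.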
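Here is the documentation for the errors parameter:
errors : {'ignore', 'raise', 'coerce'}, default 'raise'
If 'raise', then invalid parsing will raise an exception
If 'coerce', then invalid parsing will be set as NaN
If 'ignore', then invalid parsing will return the input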
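Applied to the question, converting both columns in one statement might look like the sketch below. The Player column and all the values are invented for illustration; only the column names GP and G come from the question. I use 'coerce' here because, as far as I know, errors='ignore' is deprecated in recent pandas releases (2.2 and later), so check the documentation for your version before relying on it.

```python
import pandas as pd

# Hypothetical frame; only the column names "GP" and "G" come from the question.
df = pd.DataFrame({
    "Player": ["A", "B", "C"],
    "GP": ["82", "79", "-"],
    "G": ["30", "12", "7"],
})

cols = ["GP", "G"]

# Convert both columns at once; anything unparseable becomes NaN instead of raising.
df[cols] = df[cols].apply(pd.to_numeric, errors="coerce")

print(df.dtypes)
print(df)
```

Note that a column containing NaN ends up as a float type, since NaN cannot be stored in a plain integer column.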