Question

I'm trying to convert a column to numeric (int) wherever possible. Here is an example:

    >>> import pandas as pd
    >>> s = pd.Series(["8", 6, "7.5", 3, "somestring"])
    >>> s
    0             8
    1             6
    2           7.5
    3             3
    4    somestring
    dtype: object

The documentation (https://pandas.pydata.org/pandas-docs/stable/generated/pandas.to_numeric.html) gives me this option:

    >>> pd.to_numeric(s, errors='coerce')
    0     8.0
    1     6.0
    2     7.5
    3     3.0
    4    NaN
    dtype: float64

The output I would like to get is:

    0           8.0
    1           6.0
    2           7.5
    3           3.0
    4    somestring

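So basically it should leave the non-numeric values alone and convert everything else. If I use the errors='ignore' option, the Series stays unchanged. I was thinking about indexing all the numeric values, but couldn't work out a solution. Thanks!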

Answer 1

I don't recommend this, because you end up again with a Series that mixes strings and numbers, but it is possible with combine_first or fillna:

    s1 = pd.to_numeric(s, errors='coerce').combine_first(s)
    # alternative solution
    # s1 = pd.to_numeric(s, errors='coerce').fillna(s)
    print(s1)
    0             8
    1             6
    2           7.5
    3             3
    4    somestring
    dtype: object

    print(s1.apply(type))
    0    <class 'float'>
    1    <class 'float'>
    2    <class 'float'>
    3    <class 'float'>
    4      <class 'str'>
    dtype: object
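If the mixed object column is the concern, one alternative is the "index all numeric values" idea from the question: keep the converted numbers and the leftovers in two separate Series. This is only a minimal sketch; the names `converted`, `is_numeric`, `numbers` and `leftovers` are mine, and it assumes the input contains no missing values of its own (those would land in `leftovers` as well).

```python
import pandas as pd

s = pd.Series(["8", 6, "7.5", 3, "somestring"])

# Coerce once; anything that cannot be parsed becomes NaN.
converted = pd.to_numeric(s, errors="coerce")

# Boolean mask: True where the conversion succeeded.
is_numeric = converted.notna()

numbers = converted[is_numeric]   # float64 Series, original index preserved
leftovers = s[~is_numeric]        # object Series with the unparsed values

print(numbers.dtype)
print(leftovers.tolist())
```

Both pieces keep the original index labels, so they can be matched back to the source rows later.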

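You are right, the ignore option does not help here. As soon as one value cannot be parsed, the input is returned as it was, so nothing is converted: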

    print(pd.to_numeric(s, errors='ignore').apply(type))
    0    <class 'str'>
    1    <class 'int'>
    2    <class 'str'>
    3    <class 'int'>
    4    <class 'str'>
    dtype: object
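To make the all-or-nothing behaviour above concrete, here is a rough emulation with try/except around errors='raise'. The helper name `to_numeric_or_unchanged` is hypothetical, not a pandas function. As far as I know, recent pandas releases (2.2 and later) deprecate errors='ignore' and suggest catching the exception yourself, so treat this as a sketch of that pattern rather than the library's own implementation.

```python
import pandas as pd


def to_numeric_or_unchanged(values):
    """Hypothetical helper: convert every value, or return the input untouched."""
    try:
        return pd.to_numeric(values, errors="raise")
    except (ValueError, TypeError):
        # A single unparseable value aborts the whole conversion.
        return values


mixed = pd.Series(["8", 6, "7.5", 3, "somestring"])
clean = pd.Series(["8", 6, "7.5", 3])

print(to_numeric_or_unchanged(mixed).dtype)  # object
print(to_numeric_or_unchanged(clean).dtype)  # float64
```

Only the Series without "somestring" is converted, which is why this route cannot produce a partially converted result.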

Answer 2

`pd.to_numeric` + `update`

You can update your Series in place with the numeric values:

    s = pd.Series(["8", 6, "7.5", 3, "somestring"])
    s.update(pd.to_numeric(s, errors='coerce'))

    print(s.apply(type))

    0    <class 'float'>
    1    <class 'float'>
    2    <class 'float'>
    3    <class 'float'>
    4      <class 'str'>
    dtype: object
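One thing to keep in mind: `update` modifies the Series in place and returns None, and it skips the NaN entries of the Series passed to it, which is why "somestring" survives. If the original data should stay untouched, a small sketch is to work on a copy (the names `original` and `converted` are just for illustration):

```python
import pandas as pd

original = pd.Series(["8", 6, "7.5", 3, "somestring"])

# Work on a copy so the source Series keeps its raw values.
converted = original.copy()

# update() returns None; NaN entries in the argument are not written.
result = converted.update(pd.to_numeric(converted, errors="coerce"))

print(result)
print(original.tolist())
print(converted.tolist())
```

The copy ends up with floats wherever parsing succeeded, while `original` still holds the mixed strings and ints.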
