Question

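I have a CSV file with several columns, some of which contain a mix of letters and numbers. I need to remove the letters, set those values to null, and convert the columns to integers, but I am running into errors. pandas recently added a nullable integer type: https://pandas.pydata.org/pandas-docs/stable/user_guide/integer_na.html. However, I still get an error when converting to int. The columns must stay integer, so I cannot use the usual approach where a NaN turns the column into float. The data looks like this: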

 id    count      volume   
 001,     A   ,       1
 002,     1   ,       2

The values in the count and volume columns include: '1', '2', 'A', ...

I used the re module to remove the letters and whitespace:

df["count"] = df["count"].apply(lambda x: re.sub(r'\s[a-zA-Z]*', '',x))

Now the values in the column look like: '1', '2', '', ...
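As a quick check, the substitution above can be tried on a few sample strings. This is only a sketch: the `samples` list is made up to mirror the padded values in the data, and the last entry is a hypothetical value with no leading whitespace.

```python
import re

# Same pattern as in the question: one whitespace character
# followed by zero or more ASCII letters.
pattern = r'\s[a-zA-Z]*'

# Hypothetical sample values modelled on the padded CSV fields.
samples = ['     A   ', '     1   ', 'A']

cleaned = [re.sub(pattern, '', s) for s in samples]
print(cleaned)
```

Because the pattern requires a whitespace character before the letters, a bare 'A' with no leading space is left unchanged, so the regex alone does not guarantee that every letter is removed.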

I tried to convert to 'Int64' but got an error:

  df["count"].astype(str).astype('Int64')

TypeError: object cannot be converted to an IntegerDtype

Any suggestions or workarounds?

Answer 1

 df['count'] = pd.to_numeric(df['count'], errors='coerce').astype('Int64')
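A minimal end-to-end sketch of this approach on the sample data from the question. The DataFrame is built by hand here as an assumption (the real data comes from a CSV), and the explicit `str.strip()` is added only as a precaution against the padding around the values.

```python
import pandas as pd

# Hand-built stand-in for the CSV in the question (assumed layout).
df = pd.DataFrame({
    "id": ["001", "002"],
    "count": ["     A   ", "     1   "],
    "volume": ["1", "2"],
})

for col in ["count", "volume"]:
    # Drop surrounding whitespace, then coerce anything non-numeric
    # (letters, empty strings) to NaN.
    numeric = pd.to_numeric(df[col].str.strip(), errors="coerce")
    # Convert to the nullable integer dtype; NaN becomes <NA>.
    df[col] = numeric.astype("Int64")

print(df.dtypes)
print(df)
```

With this, no regex step is needed: values that cannot be parsed as numbers become missing, and the columns stay integer.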

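With errors='coerce', pd.to_numeric turns the empty strings (and any other non-numeric values) into NaN, and astype('Int64') then stores them as missing values in a nullable integer column.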
