Question

I am working with a dataset in which one column, "price", holds values such as "$150.00". To remove the "$", I used:

df.price = ([x.strip('$') for x in df.price])

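That worked. However, the column still has dtype "object". So my next step was to check the highest values, to identify any value greater than "1000.00" that might be written as "1,000.00". I used: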

print((df["price"]).sort_values(ascending=False))

This returned "999.00" as the highest value in the list.

Then I tried to convert the object column "price" to float. I used:

df['price'] = df['price'].apply(np.float)

But it returned:

ValueError: could not convert string to float: '2,000.00'

The column was not supposed to contain any number greater than 999.00. I tried to remove every "," using:

df.price = ([x.strip(',') for x in df.price])

Then I tried again:

df['price'] = df['price'].apply(np.float)

But it returned the same error: ValueError: could not convert string to float: '2,000.00'

I don't know what is happening or what I am doing wrong.

Answer 1

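strip only removes characters from the left and right ends of a string, and the "," sits in the middle, so consider using replace instead. Something like this should work: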

df.price = ([x.replace(',', '') for x in df.price])

Then convert them to floats.
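
As a minimal sketch of the difference (the sample values are made up for illustration, not taken from the asker's data), strip leaves a comma in the middle of the string untouched, while replace removes it wherever it occurs. It also suggests why the sort appeared to top out at "999.00": while the column holds strings, values are compared character by character, so "999.00" sorts above "2,000.00".

```python
import pandas as pd

# Hypothetical sample data, made up for illustration.
df = pd.DataFrame({"price": ["$150.00", "$999.00", "$2,000.00"]})

# strip only trims the ends of each string, so the inner comma survives.
stripped = [x.strip("$").strip(",") for x in df.price]
print(stripped)  # the last value still contains a comma

# As strings, values compare character by character, and "9" > "2".
print(max(stripped))

# replace removes the character wherever it occurs in the string.
cleaned = [x.replace("$", "").replace(",", "") for x in df.price]
df["price"] = [float(x) for x in cleaned]
print(df["price"].max())
```

Once the column is numeric, sort_values orders it by value rather than by text.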

Answer 2

You can also do this with apply:

# np.float was only an alias for the builtin float and was removed in NumPy 1.24
df1.Value.apply(lambda x: float(str(x).replace(',', '')))
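
Another option, not mentioned in the answers above, is to do the cleanup with pandas' vectorized string methods rather than a Python-level loop. The following is only a sketch under the assumption that the column contains nothing but digits, ".", "," and "$"; the sample data is hypothetical.

```python
import pandas as pd

# Hypothetical sample data, made up for illustration.
df = pd.DataFrame({"price": ["$150.00", "$999.00", "$2,000.00"]})

# Remove every "$" and "," in one pass, then convert to a numeric dtype.
cleaned = df["price"].str.replace(r"[$,]", "", regex=True)
df["price"] = pd.to_numeric(cleaned)

# Sorting now compares numbers rather than strings.
print(df["price"].sort_values(ascending=False))
```

If some rows might hold text that cannot be parsed, pd.to_numeric accepts errors="coerce" to turn those into NaN instead of raising.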