Why won't this string convert to float?

Asked: 2019-06-07 14:00:04

Tags: python pandas jupyter-notebook

The "Amount" column is a string. I want to convert it to float so that I can use these rows in later calculations.

In [1] import pandas as pd

       data = pd.read_csv('input.csv')

       data
Out [1] 
    ID  Amount          Cost
0   A   9,596,249.09    1000000
1   B   38,385,668.57   50000
2   C   351,740.00      100
3   D   -               23
4   E   178,255.96      999

Note that the Amount for "D" is "-" rather than zero.

First, I clean up the bad data:

In [2]
    data['Amount'] = data['Amount'].replace(' -   ', 0)
    data
Out [2]
    ID  Amount          Cost
0   A   9,596,249.09    1000000
1   B   38,385,668.57   50000
2   C   351,740.00      100
3   D   0               23
4   E   178,255.96      999

Then I tried two ways of converting the column to float. Neither worked:

In [3]
    pd.Series(data['Amount']).astype(float)
Out [3]
    ValueError: could not convert string to float: '9,596,249.09'

and:

In [4]
    pd.to_numeric(data['Amount'])
Out [4]
    ValueError: Unable to parse string "9,596,249.09" at position 0

In desperation, I tried converting row by row:

In [5]
    def cleandata(x):
        return float(x)

    data['Amount'] = data['Amount'].apply(cleandata)
Out [5]
    ValueError: could not convert string to float: '9,596,249.09'

I would appreciate any advice. I have been at this for several hours. Thanks.

3 Answers:

Answer 0 (score: 1)

Try:

data = pd.read_csv('input.csv', thousands=',', decimal='.')
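
One caveat for the sample data: while the "-" placeholder is still in the column, `thousands=','` on its own may leave Amount as strings, because a column is only converted when every value in it parses as a number. A minimal sketch, assuming the placeholder is read as exactly "-" and that amounts containing commas are quoted in the file (the inline CSV text is a hypothetical stand-in for input.csv):

```python
import io

import pandas as pd

# Hypothetical stand-in for input.csv; the real file's layout is an assumption.
csv_text = (
    'ID,Amount,Cost\n'
    'A,"9,596,249.09",1000000\n'
    'B,"38,385,668.57",50000\n'
    'C,"351,740.00",100\n'
    'D,-,23\n'
    'E,"178,255.96",999\n'
)

# thousands=',' strips the separators while parsing;
# na_values=['-'] turns the placeholder into NaN so the column can be numeric.
data = pd.read_csv(io.StringIO(csv_text), thousands=',', na_values=['-'])

# Treat the missing amount as zero, as the question intends.
data['Amount'] = data['Amount'].fillna(0.0)

print(data.dtypes)
```

If the placeholder in the real file carries surrounding spaces, the string passed to `na_values` would need to match what the parser actually reads.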

Answer 1 (score: 1)

You need to remove the commas; that solves the problem. Try this:

data['Amount'] = data['Amount'].apply(lambda x: x.replace(",", "")) # take the commas away
data['Amount'] = data.Amount.astype(float) 
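
The underlying reason is that `float()` does not accept thousands separators, so `float('9,596,249.09')` raises the ValueError shown in the question. Note also that if In [2] from the question has already run, the column holds an integer 0 next to the strings, and calling `x.replace` on that integer would fail. A minimal sketch that tolerates the mixed column, using a hypothetical Series in place of `data['Amount']`:

```python
import pandas as pd

# Hypothetical stand-in for data['Amount'] after In [2]:
# comma-formatted strings plus the integer 0 that replaced the dash.
amount = pd.Series(['9,596,249.09', '38,385,668.57', '351,740.00', 0, '178,255.96'])

# astype(str) turns the integer 0 into '0' so every row is a string,
# then the commas are removed literally (regex=False) before converting.
amount = amount.astype(str).str.replace(',', '', regex=False).astype(float)

print(amount)
```

The vectorized `.str.replace` does the same job as the `apply` with a lambda, without calling a Python function per row.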

Answer 2 (score: 0)

Building a list (y) seems to work.

In [1]:
import pandas as pd
data = pd.read_csv('input.csv')
y = list(data["Amount"])
y = [item.replace(" -   " , '0') for item in y]
y = [item.replace("," , '') for item in y]
data["Amount"] = y
data["Amount"] = pd.to_numeric(data['Amount'], errors='coerce')
data['Result'] = data["Amount"] - data["Cost"]
data

Out [1]:
    ID  Amount       Cost     Result
0   A   9596249.09   1000000  8596249.09
1   B   38385668.57  50000    38335668.57
2   C   351740.00    100      351640.00
3   D   0.00         23       -23.00
4   E   178255.96    999      177256.96

I'm sure there is a better, more Pythonic way to write this ^
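
One possibly more compact variant of the same idea, as a sketch rather than the definitive way: strip the commas with the `.str` accessor, let `pd.to_numeric(..., errors='coerce')` turn anything unparseable (here the dash) into NaN, then fill those with 0. The DataFrame below is a hypothetical reconstruction of the question's data, and the exact spacing around the dash is an assumption.

```python
import pandas as pd

# Hypothetical reconstruction of the question's data.
data = pd.DataFrame({
    'ID': ['A', 'B', 'C', 'D', 'E'],
    'Amount': ['9,596,249.09', '38,385,668.57', '351,740.00', ' -   ', '178,255.96'],
    'Cost': [1000000, 50000, 100, 23, 999],
})

# Remove thousands separators, coerce non-numeric strings to NaN, then use 0 for them.
without_commas = data['Amount'].str.replace(',', '', regex=False)
data['Amount'] = pd.to_numeric(without_commas, errors='coerce').fillna(0)

data['Result'] = data['Amount'] - data['Cost']
print(data)
```

Be aware that `errors='coerce'` silently converts every unparseable value to NaN, not only the dash, so it can hide other bad entries in the column.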