Cannot convert a string column to float in pandas

Time: 2019-09-08 05:55:55

Tags: python pandas

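I want to add a new column named '2016 Salary ($)' to the DataFrame income, containing the employee salaries from the Salary Paid column as numbers. To get a number, I want to remove the '$' and ',' characters.

But when I do this, I get the following error:

"could not convert string to float"

I tried following the hint, but it does not work: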

income['2016 Salary ($)']= income['SalaryPaid'].str.strip('$').astype(float)
income['2016 Salary ($)'].apply(lambda X:X['Salary Paid'])
income

4 Answers:

Answer 0: (score: 2)

Try something like this:

Data

import pandas as pd

dic = {'Name':['John','Peter'],'SalaryPaid':['$204,546,289.35','$500,231,289.35'],'Year':['2008','2009']}
df1 = pd.DataFrame(dic)
df1

    Name    SalaryPaid      Year
0   John    $204,546,289.35 2008
1   Peter   $500,231,289.35 2009

Code

df1['SalaryPaid'] = df1['SalaryPaid'].str.replace(',', '')
# If you want the result as a string:
df1['2016 Salary ($)'] = df1['SalaryPaid'].str.strip('$')
# If you want the result as a float:
# df1['2016 Salary ($)'] = df1['SalaryPaid'].str.strip('$').astype(float)

df1

Result

    Name    SalaryPaid  Year    2016 Salary ($)
0   John    $204546289.35   2008    204546289.35
1   Peter   $500231289.35   2009    500231289.35
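
To make the cause of the error concrete, here is a minimal sketch, assuming a hypothetical `income` frame with the same kind of values as the question. It shows that stripping only '$' leaves the commas in place, so the cast to float fails, while removing the commas first lets it succeed.

```python
import pandas as pd

# Hypothetical sample data mirroring the question's column
income = pd.DataFrame({'SalaryPaid': ['$204,546,289.35', '$500,231,289.35']})

# Stripping only '$' leaves the commas, so the cast to float fails
try:
    income['SalaryPaid'].str.strip('$').astype(float)
    failed = False
except ValueError:
    failed = True

# Removing the commas first lets the cast succeed
cleaned = (
    income['SalaryPaid']
    .str.replace(',', '', regex=False)
    .str.strip('$')
    .astype(float)
)
```

Passing `regex=False` is a choice made here so that ',' is always treated as a literal character, whatever the pandas version's default is.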

Answer 1: (score: 2)

First add Series.str.replace to remove the commas:

# Parentheses are needed so the method chain can span several lines
income['2016 Salary ($)'] = (income['SalaryPaid'].str.replace(',', '')
                                                 .str.strip('$')
                                                 .astype(float))

If the DataFrame is created from a file, a better solution is to use the thousands parameter of read_csv:

income = pd.read_csv(file, thousands=',')

income['2016 Salary ($)']= income['SalaryPaid'].str.strip('$').astype(float)
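
As a related sketch, the cleanup can also be done while the file is being read, using the `converters` parameter of read_csv. This is a different technique from the `thousands` parameter mentioned in this answer. The CSV text and the `parse_salary` helper below are hypothetical and only serve as an illustration.

```python
import io
import pandas as pd

# Hypothetical CSV content; quoting keeps the commas inside a single field
csv_text = (
    'Name,SalaryPaid\n'
    'John,"$204,546,289.35"\n'
    'Peter,"$500,231,289.35"\n'
)

def parse_salary(text):
    """Drop '$' and ',' from the raw field and return a float."""
    return float(text.replace('$', '').replace(',', ''))

# The converter runs on every raw value of the SalaryPaid column
income = pd.read_csv(io.StringIO(csv_text), converters={'SalaryPaid': parse_salary})
income['2016 Salary ($)'] = income['SalaryPaid']
```

With this approach the column is already numeric when the DataFrame is created, so no later string handling is needed.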

Answer 2: (score: 1)

I created a dummy DataFrame based on your requirements and performed the same operation you described above, and it works fine for me.

import pandas as pd
df = pd.DataFrame(columns=['AA','BB'])
df['AA'] = ['$12,20','$13,30']
df['BB'] = ['X','Y']
print(df)

Output ----->
       AA BB
0  $12,20  X
1  $13,30  Y

# regex=False makes '$' a literal character rather than a regex end-of-string anchor
df['AA'] = df['AA'].str.replace('$','',regex=False).str.replace(',','').astype(float)
print(df)

Output ----->
       AA BB
0  1220.0  X
1  1330.0  Y

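In my opinion, the error is in the second line of your code, where you apply the lambda: instead of "income['2016 Salary ($)'].apply(lambda X:X['Salary Paid'])" it should be "income['2016 Salary ($)'].apply(lambda X:X['SalaryPaid'])". I think there is a typo in the name of the SalaryPaid column.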

Answer 3: (score: 0)

You can also do this:

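def convert(x):
    # Remove '$' and ',' and convert what remains to float
    return float(x.replace('$','').replace(',',''))

income['2016 Salary ($)'] = income['Salary Paid'].apply(convert)

import re

def convert(x):
    # Alternative: keep only the digits and the decimal point
    return float(''.join(re.findall(r'[\d.]', x)))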
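
For completeness, a small self-contained sketch of how the regex-based variant might be applied. The sample `income` frame is hypothetical, and the column name 'Salary Paid' is taken from this answer.

```python
import re
import pandas as pd

def convert(x):
    # Keep only the digits and the decimal point, then cast to float
    return float(''.join(re.findall(r'[\d.]', x)))

# Hypothetical sample data using the column name from this answer
income = pd.DataFrame({'Salary Paid': ['$204,546,289.35', '$500,231,289.35']})
income['2016 Salary ($)'] = income['Salary Paid'].apply(convert)
```

Note that this pattern also discards a leading minus sign, so it only suits columns where negative amounts cannot occur.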