Calculate and create a percentage column from two columns

Time: 2016-03-31 11:29:19

Tags: python string pandas dataframe percentage

I have a df (Apple_farm) and need to calculate a percentage from the values in two of its columns (Good_apples and Total_apples), then add the result to Apple_farm as a new column called 'Perc_Good'.

I have tried:

Apple_farm['Perc_Good'] = (Apple_farm['Good_apples'] / Apple_farm['Total_apples']) *100

However, this results in the following error:

  

TypeError: unsupported operand type(s) for /: 'str' and 'str'

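Running

print Apple_farm['Good_apples']
print Apple_farm['Total_apples']

yields what looks like a list of numerical values in each case, yet dividing them seems to result in them being treated as strings?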

I have also tried to define a new function:

def percentage(amount, total):
    percent = amount/total*100
    return percent

but I am not sure how to use it.

Any help would be appreciated, as I am fairly new to both Python and pandas!

1 answer:

Answer 0: (score: 3)

I think you need to convert the string columns to float or int, because their type is string (even though the values look like numbers):

# Option 1: convert to float
Apple_farm['Good_apples'] = Apple_farm['Good_apples'].astype(float)
Apple_farm['Total_apples'] = Apple_farm['Total_apples'].astype(float)

# Option 2: convert to int
Apple_farm['Good_apples'] = Apple_farm['Good_apples'].astype(int)
Apple_farm['Total_apples'] = Apple_farm['Total_apples'].astype(int)
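If some entries might not be clean numbers (blanks or stray text, for example), astype would raise an error. One possible alternative is pd.to_numeric with errors="coerce", which turns values it cannot parse into NaN. The sketch below uses Python 3 syntax, and its data is invented for illustration (the "n/a" entry is an assumption, not something from the question):

```python
import pandas as pd

# Hypothetical data: the "n/a" entry is made up to show a non-numeric value.
Apple_farm = pd.DataFrame({
    "Good_apples": ["10", "20", "n/a"],
    "Total_apples": ["20", "80", "30"],
})

# errors="coerce" replaces values that cannot be parsed with NaN
# instead of raising an exception.
for col in ["Good_apples", "Total_apples"]:
    Apple_farm[col] = pd.to_numeric(Apple_farm[col], errors="coerce")

# Rows with a NaN input end up with NaN in the percentage column.
Apple_farm["Perc_Good"] = Apple_farm["Good_apples"] / Apple_farm["Total_apples"] * 100
print(Apple_farm)
```

Whether silently producing NaN is preferable to failing loudly depends on the data; astype is the stricter choice when every value is expected to be numeric.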

Sample:

import pandas as pd

Good_apples = ["10", "20", "3", "7", "9"]
Total_apples = ["20", "80", "30", "70", "90"]
d = {"Good_apples": Good_apples, "Total_apples": Total_apples}
Apple_farm = pd.DataFrame(d)
print Apple_farm
  Good_apples Total_apples
0          10           20
1          20           80
2           3           30
3           7           70
4           9           90

print Apple_farm.dtypes
Good_apples     object
Total_apples    object
dtype: object

print Apple_farm.at[0,'Good_apples']
10

print type(Apple_farm.at[0,'Good_apples'])
<type 'str'>

Apple_farm['Good_apples'] = Apple_farm['Good_apples'].astype(int)
Apple_farm['Total_apples'] = Apple_farm['Total_apples'].astype(int)

print Apple_farm.dtypes
Good_apples     int32
Total_apples    int32
dtype: object

print Apple_farm.at[0,'Good_apples']
10

print type(Apple_farm.at[0,'Good_apples'])
<type 'numpy.int32'>

Apple_farm['Perc_Good'] = (Apple_farm['Good_apples'] / Apple_farm['Total_apples']) *100

print Apple_farm
   Good_apples  Total_apples  Perc_Good
0           10            20       50.0
1           20            80       25.0
2            3            30       10.0
3            7            70       10.0
4            9            90       10.0
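
As for the percentage function from the question: once the columns are numeric it can probably be used as it stands, because arithmetic on pandas Series is applied element by element. A minimal sketch in Python 3 syntax, reusing the sample data above (converting the whole frame with DataFrame.astype(int) is a shortcut chosen here, not part of the original answer):

```python
import pandas as pd

def percentage(amount, total):
    # Works for plain numbers and for pandas Series alike,
    # since Series arithmetic is element-wise.
    return amount / total * 100

# Same sample values as above, stored as strings and then converted to int.
Apple_farm = pd.DataFrame({
    "Good_apples": ["10", "20", "3", "7", "9"],
    "Total_apples": ["20", "80", "30", "70", "90"],
}).astype(int)

# Pass whole columns to the function; the result is a new Series.
Apple_farm["Perc_Good"] = percentage(Apple_farm["Good_apples"],
                                     Apple_farm["Total_apples"])
print(Apple_farm)
```

Passing whole columns avoids an explicit loop or apply call, which is usually both shorter and faster for simple arithmetic like this.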