Question

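Background

I have a dataset covering roughly 200 countries (rows) over different time periods (columns). A sample of it as a pandas DataFrame looks like this: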

import pandas as pd

data = {'Country': ['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola'],
        '1958-1962': [0, 0, 0, 0, 0],
        '2008-2012': [0.0, 0.0, 8.425, 0.0, 0.0],
        '2013-2017': [0.0, 0.0, 10.46, 0.0, 0.0]}

df = pd.DataFrame(data)

     Country  1958-1962  2008-2012  2013-2017
 Afghanistan          0      0.000       0.00
     Albania          0      0.000       0.00
     Algeria          0      8.425      10.46
     Andorra          0      0.000       0.00
      Angola          0      0.000       0.00

I am trying to get the sum of all the values in each column with the following code.

y_data = []

period_list = list(df)
period_list.remove('Country')

for x in period_list:
    y_data.append(df[x].sum())

Error

TypeError: unsupported operand type(s) for +: 'int' and 'str'

Process finished with exit code 1

For some reason, pandas seems to be including the header in the sum. How can I fix this?
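
The header itself is probably not what gets summed. A more likely cause, assuming at least one period column holds numbers mixed with strings, is that pandas stores the column as object and the reduction ends up adding an int to a str. A minimal sketch with a made-up two-value column (the values are hypothetical, not taken from the real file):

```python
import pandas as pd

# Hypothetical column: one numeric entry and one entry stored as text.
mixed = pd.Series([0, "8.425"], name="2008-2012")

print(mixed.dtype)  # object, because the values have mixed Python types

try:
    mixed.sum()
    error = None
except TypeError as exc:
    # Python cannot add an int and a str, so the reduction fails.
    error = exc

print(type(error).__name__, error)
```

If this reproduces the same TypeError, the fix is to convert the data rather than to change anything about the header.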

Additional tests

I tested the sum function by calling df.sum() on the following DataFrame, and it correctly returned the column sums 18, 20, 20 and 19.

df = pd.DataFrame({"A":[5, 3, 6, 4], 
                   "B":[11, 2, 4, 3], 
                   "C":[4, 3, 8, 5], 
                   "D":[5, 4, 2, 8]})

The output of print(df.drop("Country",axis=1).dtypes) is as follows:

1958-1962    object
1963-1967    object
1968-1972    object
1973-1977    object
1978-1982    object
1983-1987    object
1988-1992    object
1993-1997    object
1998-2002    object
2003-2007    object
2008-2012    object
2013-2017    object
dtype: object
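
With every period column reported as object, it can help to see which values are keeping a column from being numeric. The sketch below is one way to check. The sample data, including the ".." placeholder, is invented for illustration, since the question does not show the real file's contents.

```python
import pandas as pd

# Hypothetical data: the real file's contents are not shown in the question.
df = pd.DataFrame({
    "Country": ["Algeria", "Andorra", "Angola"],
    "2008-2012": ["8.425", "..", 0],
})

col = df["2008-2012"]

# Count the Python types stored in the object column.
type_counts = col.map(lambda v: type(v).__name__).value_counts()
print(type_counts)

# Values that pd.to_numeric cannot parse become NaN with errors="coerce";
# keep only those that were not missing to begin with.
unparseable = col[pd.to_numeric(col, errors="coerce").isna() & col.notna()]
print(unparseable.tolist())
```

Anything listed in unparseable would need to be cleaned or treated as missing before summing.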

Solution

I solved the problem by converting the object columns to numeric with df = df.apply(pd.to_numeric, errors='ignore').
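
One caveat: errors='ignore' returns a column unchanged when it cannot be parsed, so a column with a single bad value would stay object, and pandas 2.2 deprecated that option. A possible alternative, sketched here on invented sample data, is to convert only the period columns with errors='coerce', which turns unparseable entries into NaN (sum() skips NaN by default):

```python
import pandas as pd

# Hypothetical data mimicking numbers that were read in as strings.
df = pd.DataFrame({
    "Country": ["Albania", "Algeria", "Andorra"],
    "2008-2012": ["0.0", "8.425", "n/a"],
    "2013-2017": ["0.0", "10.46", "0.0"],
})

# Convert only the period columns; leave "Country" as text.
period_cols = df.columns.drop("Country")
df[period_cols] = df[period_cols].apply(pd.to_numeric, errors="coerce")

print(df.dtypes)
print(df[period_cols].sum())
```

Whether silently turning bad values into NaN is acceptable depends on the data, so it is worth inspecting what was coerced first.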

Answer 1

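Convert the columns you want to sum from object to numeric, then drop the "Country" column, and then sum the remaining columns.

For converting from object to numeric, refer to this link.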
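
The steps in this answer could be sketched as follows. This uses the question's sample values stored as strings, which is an assumption about how the real data was loaded, and it drops the "Country" column before converting so that only period columns are touched.

```python
import pandas as pd

# The question's sample, with values stored as strings (an assumption
# about how the real data ended up as object dtype).
df = pd.DataFrame({
    "Country": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
    "1958-1962": ["0", "0", "0", "0", "0"],
    "2008-2012": ["0.0", "0.0", "8.425", "0.0", "0.0"],
    "2013-2017": ["0.0", "0.0", "10.46", "0.0", "0.0"],
})

# Drop the label column, convert what is left to numeric, sum each column.
totals = (
    df.drop(columns="Country")
      .apply(pd.to_numeric, errors="coerce")
      .sum()
)

period_list = totals.index.tolist()  # column names, in order
y_data = totals.tolist()             # one total per period
print(period_list)
print(y_data)
```

This replaces the explicit loop in the question: totals is a Series indexed by period, so period_list and y_data stay aligned.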
