Question

I'm trying to write some data in a CSV file using Python. I have a list of countries and their Eurovision Song Contest results, and it looks like this:

Country,Points,Year
Belgium;181;2016
Netherlands;153;2016
Australia;511;2016
Belgium;217;2015
Australia;196;2015

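and so on.

In short, I want to sum up the total number of points each country has received over the years, so the output should look something like this: 'Belgium: 398', 'Netherlands: 153', 'Australia: 707', and so on.

This is my code: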

import csv
with open('euro20042016.csv', 'r') as csvfile:
    pointsallyears = []
    countriesallyears = []
    readFILE = csv.reader(csvfile, delimiter=';')
    for row in readFILE:
        countriesallyears.append(row[0])
        pointsallyears.append(row[1])
csvfile.close()

results = []
for result in pointsallyears:
    result = int(result)
    results.append(result)

scorebord = zip(countriesallyears,results)

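So I've made sure that the results/points are actual integers, and I've filtered out the third column (the year), but I don't know how to go on from here. Thanks a lot in advance!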

Answer 1

Turning @Mikk's comment into an actual answer. It takes two lines, apart from the import:

import pandas as pd
df = pd.read_csv('euro20042016.csv', sep=';')
print(df.groupby('Country')['Points'].sum())

The only thing you need to do is change the first line of the file so that it uses ; instead of , as the separator.
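
If editing the file is not an option, one possible workaround is to skip the mismatched header line and supply the column names yourself. The sketch below is only an illustration: it assumes the sample rows from the question and reads them from an in-memory string (the SAMPLE name is made up here) rather than from euro20042016.csv.

```python
import io

import pandas as pd

# Hypothetical sample mirroring the question's file: a comma-separated
# header line followed by semicolon-separated data rows.
SAMPLE = """Country,Points,Year
Belgium;181;2016
Netherlands;153;2016
Australia;511;2016
Belgium;217;2015
Australia;196;2015
"""

# skiprows=1 drops the mismatched header; names= supplies the column labels.
df = pd.read_csv(
    io.StringIO(SAMPLE),
    sep=';',
    skiprows=1,
    names=['Country', 'Points', 'Year'],
)

# Sum the Points column separately for each country.
totals = df.groupby('Country')['Points'].sum()
print(totals)
```

With the real file, io.StringIO(SAMPLE) would be replaced by the file name.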

Answer 2

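I changed your code slightly to use a dictionary, with the country names as keys. In the resulting dictionary d, each key is a country name and each value is that country's total points.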

import csv

d = dict()

with open('euro20042016.csv', 'r') as csvfile:
    readFILE = csv.reader(csvfile, delimiter=';')
    for row in readFILE:
        # Skip blank lines and the header line, which has no numeric points field
        if len(row) < 2 or not row[1].isdigit():
            continue
        # Add this row's points to the running total for the country
        d[row[0]] = d.get(row[0], 0) + int(row[1])

print(d)
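
The same dictionary idea can also be written with collections.defaultdict, which starts every missing key at 0 so no membership check is needed. This is a minimal sketch rather than a drop-in replacement: the sum_points helper and the in-memory SAMPLE data are made up for illustration, and the rows are taken from the question.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data copied from the question, header included.
SAMPLE = """Country,Points,Year
Belgium;181;2016
Netherlands;153;2016
Australia;511;2016
Belgium;217;2015
Australia;196;2015
"""


def sum_points(lines):
    """Return a dict mapping each country name to its total points."""
    totals = defaultdict(int)  # missing countries start at 0
    for row in csv.reader(lines, delimiter=';'):
        # Skip blank lines and the header, which has no numeric points field.
        if len(row) < 2 or not row[1].strip().isdigit():
            continue
        totals[row[0]] += int(row[1])
    return dict(totals)


totals = sum_points(io.StringIO(SAMPLE))
print(totals)
```

To run it against the real file, pass the open file object to sum_points instead of the StringIO wrapper.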

Answer 3

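I decided to play around with your code, and this is what I came up with. Here, row[0] contains the country name and row[1] contains the value we need. We check whether the country already exists in the dictionary we use to keep the running totals, and if it doesn't, we create it.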

import csv
with open('euro20042016.csv', 'r') as csvfile:
    score_dict = {}
    readFILE = csv.reader(csvfile, delimiter=';')
    for row in readFILE:
        # Only rows with 3 elements have the data we need
        if len(row) == 3:
            if row[0] in score_dict:
                score_dict[row[0]] += int(row[1])
            else:
                score_dict[row[0]] = int(row[1])
print(score_dict)

The output I get is

{'Belgium': 398, 'Australia': 707, 'Netherlands': 153}

I think this is what you were aiming for.

If you have trouble understanding any of it, let me know in the comments.
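
If the 'Country: total' strings from the question are wanted, the dictionary can be turned into them in one extra step. The sketch below assumes the totals shown above; the format_scores helper is a made-up name, and sorting from highest to lowest total is just one possible choice.

```python
# Totals as produced by the code above (values taken from the sample data).
score_dict = {'Belgium': 398, 'Australia': 707, 'Netherlands': 153}


def format_scores(scores):
    """Return 'Country: total' strings, highest total first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ['{}: {}'.format(country, points) for country, points in ranked]


for line in format_scores(score_dict):
    print(line)
```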

Answer 4

I have a solution. But make sure your euro20042016.csv file looks like this:

Belgium;181;2016
Netherlands;153;2016
Australia;511;2016
Belgium;217;2015
Australia;196;2015

This code produces its output as a list, like

[('Belgium', 398), ('Australia', 707), ('Netherlands', 153)]

Here is the code:

try:
    with open('euro20042016.csv', 'r') as f:
        s = f.read()

    # Split into lines, skip blank ones, then split each line on ';'
    lst = [x.split(';') for x in s.split('\n') if x.strip()]

    points, country = [], []
    for line in lst:
        points.append(int(line[1]))
        country.append(line[0])

    countrypoints = sorted(zip(country, points), key=lambda x: x[1])
    country = list(set(country))
    total = [0]*len(country)

    # Add each record's points to the slot belonging to its country
    for rec in countrypoints:
        total[country.index(rec[0])] = total[country.index(
            rec[0])] + rec[1]
    finalTotal = list(zip(country, total))
    print(finalTotal)

except IOError as ex:
    print(ex)
except Exception as ex:
    print(ex)
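
A sort-then-aggregate approach like this one can also be expressed with itertools.groupby, which avoids the repeated country.index lookups. This is a different technique from the code above, shown only as a sketch: total_by_country is a made-up helper, and the records list holds the sample rows already parsed into (country, points) pairs.

```python
from itertools import groupby
from operator import itemgetter

# Sample (country, points) pairs, assumed to be parsed from the file already.
records = [
    ('Belgium', 181),
    ('Netherlands', 153),
    ('Australia', 511),
    ('Belgium', 217),
    ('Australia', 196),
]


def total_by_country(pairs):
    """Return (country, total) tuples, sorted by country name."""
    # groupby only merges adjacent items, so sort by country first.
    ordered = sorted(pairs, key=itemgetter(0))
    return [
        (country, sum(points for _, points in group))
        for country, group in groupby(ordered, key=itemgetter(0))
    ]


finalTotal = total_by_country(records)
print(finalTotal)
```

Unlike the set-based version, the result here always comes out in alphabetical order by country.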

I hope this helps.

Python: how do I sum the integers in a CSV file, but only for a certain variable?