Parsing a CSV and analyzing the data

Time: 2017-09-05 01:42:41

Tags: python csv indexing

I'm working through some exercises on Hacker Rank, but I can't figure out why it won't accept my answer.

Here is the link to the original repository.

The goal is to print the name of the team with the smallest difference between its goals scored and goals allowed.

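There seem to be two possibilities, Leicester and Aston_Villa: Leicester has the most negative difference between goals scored and goals allowed (-34), while Aston_Villa has the smallest absolute difference (-1). However, neither of these is accepted.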

Any ideas why?

import sys
import os
import csv

text = '''Team,Games,Wins,Losses,Draws,Goals,Goals Allowed,Points
Arsenal,38,26,9,3,79,36,87
Liverpool,38,24,8,6,67,30,80
Manchester United,38,24,5,9,87,45,77
Newcastle,38,21,8,9,74,52,71
Leeds,38,18,12,8,53,37,66
Chelsea,38,17,13,8,66,38,64
West_Ham,38,15,8,15,48,57,53
Aston_Villa,38,12,14,12,46,47,50
Tottenham,38,14,8,16,49,53,50
Blackburn,38,12,10,16,55,51,46
Southampton,38,12,9,17,46,54,45
Middlesbrough,38,12,9,17,35,47,45
Fulham,38,10,14,14,36,44,44
Charlton,38,10,14,14,38,49,44
Everton,38,11,10,17,45,57,43
Bolton,38,9,13,16,44,62,40
Sunderland,38,10,10,18,29,51,40
Ipswich,38,9,9,20,41,64,36
Derby,38,8,6,24,33,63,30
Leicester,38,5,13,20,30,64,28'''

with open('football.csv', 'w') as f:
    f.write(text)



def read_data(filename):
    """Returns a list of lists representing the rows of the csv file data.

    Arguments: filename is the name of a csv file (as a string)
    Returns: list of lists of strings, where every line is split into a list of values. 
        ex: ['Arsenal', 38, 26, 9, 3, 79, 36, 87]
    """ 
    # Open the file that was actually passed in, and close it when done.
    with open(filename, 'rt') as ifile:
        reader = csv.reader(ifile)

        listed = []
        for row in reader:
            print(row)
            listed.append(row)

    return listed

data = read_data('football.csv')

def get_index_with_min_abs_score_difference(goals):
    net_goals = []

    for i in goals[1:]:
        net_goals.append(int(i[5]) - int(i[6]))

    return net_goals.index(min(net_goals))+1

def get_team(index_value, parsed_data):
    return parsed_data[index_value][0]

footballTable = read_data('football.csv')
minRow = get_index_with_min_abs_score_difference(footballTable)
print(str(get_team(minRow, footballTable)))

I also tried the alternative solution (i.e., the team with the smallest absolute difference between goals scored and goals allowed).

def get_index_with_min_abs_score_difference(goals):
    """Returns the index of the team with the smallest difference
    between 'for' and 'against' goals, in terms of absolute value.

    Arguments: goals is a list of lists of cleaned strings (header row first)
    Returns: integer row index
    """
    net_goals = []

    for i in goals[1:]:
        net_goals.append(abs(int(i[5]) - int(i[6])))

    return net_goals.index(min(net_goals))+1
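
For reference, both readings of "smallest difference" can be checked by hand on a few rows copied from the table above. This is only a rough sketch: the ROWS list and the variable names exist just for this check and are not part of the exercise. The full table yields the same two teams.

```python
# (team, goals scored, goals allowed) for a few rows from the table in the question.
ROWS = [
    ("Arsenal", 79, 36),
    ("Liverpool", 67, 30),
    ("Aston_Villa", 46, 47),
    ("Blackburn", 55, 51),
    ("Derby", 33, 63),
    ("Leicester", 30, 64),
]

# Reading 1: most negative signed difference (goals - goals allowed).
signed = min(ROWS, key=lambda r: r[1] - r[2])

# Reading 2: smallest absolute difference.
absolute = min(ROWS, key=lambda r: abs(r[1] - r[2]))

print(signed[0], signed[1] - signed[2])        # Leicester -34
print(absolute[0], absolute[1] - absolute[2])  # Aston_Villa -1
```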

1 answer:

Answer 0 (score: 0)

This is not a complete answer, but I have some comments on your solution.

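You spend a lot of lines reading the csv file row by row into a list (which you then process item by item), and then you need special logic to skip the header row. Your solution would be simpler if you used csv.DictReader instead and consumed the resulting iterator directly, rather than reading everything into a list first. Consider the output of: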

with open('football.csv', 'rt') as ifile:
    footballTable = csv.DictReader(ifile)
    for row in footballTable:
        print(row)

This will show you something like the following (the key order can vary with the Python version):

{'Draws': '3', 'Wins': '26', 'Losses': '9', 'Goals Allowed': '36', 'Points': '87', 'Games': '38', 'Goals': '79', 'Team': 'Arsenal'}
{'Draws': '6', 'Wins': '24', 'Losses': '8', 'Goals Allowed': '30', 'Points': '80', 'Games': '38', 'Goals': '67', 'Team': 'Liverpool'}
{'Draws': '9', 'Wins': '24', 'Losses': '5', 'Goals Allowed': '45', 'Points': '77', 'Games': '38', 'Goals': '87', 'Team': 'Manchester United'}
...

You will notice:

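  • The header row is handled for you automatically
  • You can now refer to columns by name instead of relying on magic indices (i[5]) in your code. That is, you can ask for i['Goals'] and i['Goals Allowed']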

With just a few more lines added to that loop, you can find the solution to your problem.
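
As a rough sketch of what those few lines might look like, assuming the exercise wants the smallest absolute difference: the helper name team_with_min_abs_difference and the inlined SAMPLE rows are made up for illustration, and the rows are copied from the question's table so the snippet runs without football.csv on disk.

```python
import csv
import io

# Three rows copied from the question's table, inlined so the sketch
# runs without football.csv on disk.
SAMPLE = """Team,Games,Wins,Losses,Draws,Goals,Goals Allowed,Points
Arsenal,38,26,9,3,79,36,87
Aston_Villa,38,12,14,12,46,47,50
Leicester,38,5,13,20,30,64,28
"""


def team_with_min_abs_difference(csv_file):
    """Return the team whose |Goals - Goals Allowed| is smallest.

    Hypothetical helper for illustration; csv_file is any open
    text file (or file-like object) containing the CSV data.
    """
    best_team = None
    best_diff = None
    # DictReader consumes the header row and yields one dict per team.
    for row in csv.DictReader(csv_file):
        diff = abs(int(row['Goals']) - int(row['Goals Allowed']))
        # Keep the first team seen with the smallest difference so far.
        if best_diff is None or diff < best_diff:
            best_team, best_diff = row['Team'], diff
    return best_team


print(team_with_min_abs_difference(io.StringIO(SAMPLE)))  # Aston_Villa
```

To run it against the real file, pass an open file object instead of the StringIO, e.g. inside `with open('football.csv', newline='') as ifile:`. If the exercise actually wants the signed difference, dropping the abs() call is the only change needed.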