The right way to analyze Python data stored in a list of dictionaries

Time: 2017-10-12 09:01:11

Tags: python-3.x list dictionary

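I'm new to Python programming and have only just got to grips with lists and dictionaries. Now I'm out of my depth, and Google has failed me.

I wrote a small program that reads scores from a CSV file: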

import csv

def main():
    # Read every row of the CSV file into a list of dictionaries.
    with open('test_data.csv', newline='') as csvfile:
        reader = csv.DictReader(csvfile)
        test_database = []
        for row in reader:
            test_database.append(dict(username=row['username'],
                                      subject=row['subject'],
                                      dificulty=row['dificulty'],
                                      answers=row['answers'],
                                      questions=row['questions'],
                                      percentage=row['percentage'],
                                      grade=row['grade']))
    # No explicit close() is needed: the with block closes the file.
    print(test_database)

if __name__ == '__main__':
    main()

It uses the following CSV file, test_data.csv:

username,subject,dificulty,answers,questions,percentage,grade
ian47,History,Hard,1,5,20.0,D
ian47,Computer Science,Medium,5,5,75.0,B

and produces the following data:

[{'username': 'ian47', 'dificulty': 'Hard', 'questions': '5', 'grade': 'D', 'percentage': '20.0', 'answers': '1', 'subject': 'History'}, {'username': 'ian47', 'dificulty': 'Medium', 'questions': '5', 'grade': 'B', 'percentage': '75.0', 'answers': '5', 'subject': 'Computer Science'}]
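One detail worth noting in this output: csv.DictReader returns every value as a string ('5', '20.0'), so comparisons and averages would need a conversion step first. A minimal sketch, assuming the column names above; the helper name convert_row is hypothetical and the rows are typed in by hand rather than read from the file:

```python
def convert_row(row):
    """Return a copy of one CSV row with its numeric fields converted."""
    converted = dict(row)
    converted['answers'] = int(row['answers'])
    converted['questions'] = int(row['questions'])
    converted['percentage'] = float(row['percentage'])
    return converted


# Hand-typed copy of the two rows shown above (all values are strings).
rows = [
    {'username': 'ian47', 'subject': 'History', 'dificulty': 'Hard',
     'answers': '1', 'questions': '5', 'percentage': '20.0', 'grade': 'D'},
    {'username': 'ian47', 'subject': 'Computer Science', 'dificulty': 'Medium',
     'answers': '5', 'questions': '5', 'percentage': '75.0', 'grade': 'B'},
]

test_database = [convert_row(r) for r in rows]

# With real numbers, max() compares percentages numerically, not as text.
best = max(test_database, key=lambda r: r['percentage'])
print(best['subject'], best['percentage'])
```

Without the conversion, max() would compare the percentage strings character by character, which can give surprising results (for example '9.0' sorts above '75.0').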

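My question is what the best way to work with this data is. I really like how this structure lays the data out and makes it easy to read, but I'm now struggling to retrieve and manipulate it.

I would like to do the following:

  • Display all results for a single person
  • Display a single person's highest/lowest/average score for each subject at each difficulty
  • Display the highest score for each subject and the person who achieved it

If anyone can help and point me in the right direction on one or two of these, I'm sure I can work out the rest.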

1 Answer:

Answer 0: (score: 0)

This question is probably too broad, but if I were you I would look into pandas. With pd.read_csv you get something like the following (I saved the original data as t.csv):

>>> import pandas as pd 
>>> t = pd.read_csv('t.csv')
>>> t
  username           subject dificulty  answers  questions  percentage grade
0    ian47           History      Hard        1          5        20.0     D
1    ian47  Computer Science    Medium        5          5        75.0     B
>>> t[t['grade'] == 'B']
  username           subject dificulty  answers  questions  percentage grade
1    ian47  Computer Science    Medium        5          5        75.0     B
>>> t.at[0, 'username'] = 'frank'
>>> t
  username           subject dificulty  answers  questions  percentage grade
0    frank           History      Hard        1          5        20.0     D
1    ian47  Computer Science    Medium        5          5        75.0     B
>>> t['percentage'].max()
75.0
>>> t.groupby(['subject', 'username'])['percentage'].max()
subject           username
Computer Science  ian47       75.0
History           frank       20.0
Name: percentage, dtype: float64
>>> 
>>> t.to_csv('t.csv', index=False)

Then, when I check the contents of t.csv, I get:

username,subject,dificulty,answers,questions,percentage,grade
frank,History,Hard,1,5,20.0,D
ian47,Computer Science,Medium,5,5,75.0,B
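
As a rough sketch of how the three requests in the question might look in pandas, assuming the same column names (including the dificulty spelling from the CSV header). The variable names ian_results, stats and top_rows are illustrative only, and the sample data is inlined through io.StringIO so nothing is read from disk:

```python
import io

import pandas as pd

# Inline copy of the sample data so the sketch runs without a file on disk.
CSV_TEXT = """username,subject,dificulty,answers,questions,percentage,grade
ian47,History,Hard,1,5,20.0,D
ian47,Computer Science,Medium,5,5,75.0,B
"""

t = pd.read_csv(io.StringIO(CSV_TEXT))

# 1. All results for a single person.
ian_results = t[t['username'] == 'ian47']

# 2. Highest / lowest / average percentage per user, subject and difficulty.
stats = (t.groupby(['username', 'subject', 'dificulty'])['percentage']
          .agg(['max', 'min', 'mean']))

# 3. Top score in each subject, together with the person who achieved it.
#    idxmax() returns the row label of the maximum within each group,
#    and .loc then pulls those full rows back out of the frame.
top_rows = t.loc[t.groupby('subject')['percentage'].idxmax(),
                 ['subject', 'username', 'percentage']]

print(ian_results)
print(stats)
print(top_rows)
```

With only two rows the results are trivial, but the same three expressions should carry over unchanged to a larger file with several users. Note that idxmax() keeps only the first row when two people tie for the top score in a subject.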