I have a .CSV file containing LSAT answers exported from a website, like so:
"Test","Question","Section","Question Type","Your Answer","Correct Answer"
"PT 62","2: 1","LR","Best Principle for Example","D","D"
"PT 62","2: 2","LR","Strengthen","E","E"
"PT 62","2: 3","LR","Direct Logic Link","B","B"
... repeat 1,000x
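I want to start pulling data out of this .CSV file so that I can figure out the percentage (or count) of questions I got right for each "Question Type".
I have gone through the Python manual, many forum questions of a similar kind, and a lot of .count answers, but none of them seem to do what I want.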
import csv
import itertools
import json
from collections import Counter
file = open('C:/Users/Kenny/Downloads/logicReasoning.csv')
reader = csv.reader(file)
data = list(reader)
masterList = []
questionTypes =[]
for row in data:
    masterList.append(row[3])
for x in masterList:
    c = Counter(x)
    masterList.count(x)
    print("total "+x+":", masterList.count(x))
Output:
total Justify: 28
total Definition: 28
total Most Similar in Flawed Reasoning: 14
total Resolve Discrepancy: 24
etc, for each question type.
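The code above prints each "Question Type" together with the number of times it was counted in masterList, once for every occurrence of that question type,
so "total Justify: 28" is printed 28 times, once for each time it appears in the CSV file.
I want "Justify" printed only once, along with its total count in the CSV file.
I would then re-run the same code for "Question Type", creating a new empty list and appending each instance only when the answer is correct, given as: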
if row[4] == row[5]:
    correctList.append(row[3])
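Is this the right way to count the total questions by question type and the total correct questions by question type, so that I can then derive percentages and other data?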
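Answer 0 (score: 1)
"I want to start pulling data out of this .CSV file so that I can figure out the percentage (or count) of questions I got right for each "Question Type"."
Tasks like this are easy to do with pandas, and I recommend giving the library a try. Here is a brief demo of how to use pandas.DataFrame.
Demo: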
import pandas as pd
demo = pd.DataFrame(
    [['A', 'one', 'two'],
     ['B', 'foo', 'bar'],
     ['A', 'fizz', 'fizz'],
     ['A', 'buzz', 'buzz']],
    columns=['Question Type', 'Your Answer', 'Correct Answer'])
print(demo)
print()
demo['is_correct'] = demo['Your Answer'] == demo['Correct Answer']
print(demo)
print()
correct_answers = demo.groupby(['Question Type', 'is_correct']).size()
print(correct_answers)
Output:
  Question Type Your Answer Correct Answer
0             A         one            two
1             B         foo            bar
2             A        fizz           fizz
3             A        buzz           buzz

  Question Type Your Answer Correct Answer  is_correct
0             A         one            two       False
1             B         foo            bar       False
2             A        fizz           fizz        True
3             A        buzz           buzz        True

Question Type  is_correct
A              False         1
               True          2
B              False         1
dtype: int64
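Since the question asks for a percentage rather than raw counts, here is a minimal sketch of one way to get it from the same demo data. It relies on the mean of a boolean column being the fraction of True values; the name percent_correct is just an illustrative choice.

```python
import pandas as pd

# Same toy data as in the demo.
demo = pd.DataFrame(
    [['A', 'one', 'two'],
     ['B', 'foo', 'bar'],
     ['A', 'fizz', 'fizz'],
     ['A', 'buzz', 'buzz']],
    columns=['Question Type', 'Your Answer', 'Correct Answer'])

demo['is_correct'] = demo['Your Answer'] == demo['Correct Answer']

# The mean of a boolean column is the fraction of True values,
# so this is the share of correct answers per question type, in percent.
percent_correct = demo.groupby('Question Type')['is_correct'].mean() * 100
print(percent_correct.round(1))
```

Type A has two correct answers out of three and type B has none, so the two values differ accordingly.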
In real code, you can use pandas.read_csv to read the csv file instead of typing out the DataFrame initialization by hand.
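As a rough sketch of what that could look like, the block below feeds pandas.read_csv the header and the three rows quoted in the question through an in-memory buffer, so that it runs without the real file. The buffer and the names sample_csv, df and summary are assumptions for illustration; with a file on disk you would pass its path instead.

```python
import io
import pandas as pd

# Stand-in for the real file: the header and three rows quoted in the question.
# With a file on disk, pass its path to pd.read_csv instead of this buffer.
sample_csv = io.StringIO(
    '"Test","Question","Section","Question Type","Your Answer","Correct Answer"\n'
    '"PT 62","2: 1","LR","Best Principle for Example","D","D"\n'
    '"PT 62","2: 2","LR","Strengthen","E","E"\n'
    '"PT 62","2: 3","LR","Direct Logic Link","B","B"\n'
)

df = pd.read_csv(sample_csv)
df['is_correct'] = df['Your Answer'] == df['Correct Answer']

# One row per question type: how many were asked and how many were right.
summary = df.groupby('Question Type')['is_correct'].agg(total='count', correct='sum')
summary['percent'] = summary['correct'] / summary['total'] * 100
print(summary)
```

Because read_csv treats the first line as the header, the "Question Type" heading itself is not counted as a question type here.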
Answer 1 (score: 1)
import csv
from collections import Counter
counter = Counter()
with open('lsat.csv') as fp:
    for row in csv.reader(fp):
        counter[row[3]] += 1
print(counter)
print(list(counter.keys()))
print(counter['Strengthen'])
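The same Counter idea can probably be extended to the per-type percentage the question asks about by keeping a second counter for correct answers. This is only a sketch: it reads from an in-memory buffer rather than the real file, and the last data row is a made-up wrong answer (not from the question) added so that not every percentage is 100.

```python
import csv
import io
from collections import Counter

# Stand-in for the file. The first three data rows are quoted from the question;
# the fourth is HYPOTHETICAL, a made-up wrong answer for illustration only.
sample = io.StringIO(
    '"Test","Question","Section","Question Type","Your Answer","Correct Answer"\n'
    '"PT 62","2: 1","LR","Best Principle for Example","D","D"\n'
    '"PT 62","2: 2","LR","Strengthen","E","E"\n'
    '"PT 62","2: 3","LR","Direct Logic Link","B","B"\n'
    '"PT 62","2: 4","LR","Strengthen","A","C"\n'
)

totals = Counter()   # questions seen per type
correct = Counter()  # questions answered correctly per type

# DictReader consumes the header row, so it is not counted as a question.
for row in csv.DictReader(sample):
    qtype = row['Question Type']
    totals[qtype] += 1
    if row['Your Answer'] == row['Correct Answer']:
        correct[qtype] += 1

# Each type is printed exactly once, with its totals and percentage.
for qtype, total in totals.items():
    pct = 100 * correct[qtype] / total
    print(f"{qtype}: {correct[qtype]}/{total} correct ({pct:.1f}%)")
```

Looking up a type that was never answered correctly in correct returns 0 rather than raising, which is why no special case is needed in the final loop.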