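For my final computing task, I have been asked to write a database program in Python that lets me access three class databases, each containing students' three scores in an arithmetic quiz. The data must be sortable in three ways: alphabetically by name; by average, found by adding all three scores and dividing by three; and with each student's scores ordered from highest to lowest. So, suppose the following is one of the CSV files: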
name1 name2 score1 score2 score3
Atticus Finch 9 8 10
Jem Finch 5 7 6
Jean Louise Finch 3 2 4
If the end user wants to sort alphabetically, this is what it should look like in the Python IDLE GUI:
Atticus Finch 9 8 10
Jean Louise Finch 3 2 4
Jem Finch 5 7 6
If the end user wants it sorted by average, it should look like this:
Atticus Finch 9
Jem Finch 6
Jean Louise Finch 3
If the end user wants the scores sorted from highest to lowest, it should look like this:
Atticus Finch 10 9 8
Jem Finch 7 6 5
Jean Louise Finch 4 3 2
This is what my code currently looks like:
print("Welcome to the Database sorter. The system works based on the following functions. Choose your class by inputting a letter, and choose the method of sorting the data by inputing a number afterwards. A is for Class A, B is for Class B and C is the Class C.1 is for soritng the data as an average, 2 is for sorting the data in alphabetical order and 3 is for sorting the data from highest to lowest.")
classanddatasorter =''
while classanddatasorter not in ["A1","A2","A3","B1","B2","B3","C1","C2","C3"]:
    classanddatasorter = input("You have the following nine options. Input A1 to sort the results of Class A as an average. Input A2 to sort the results of Class A in alphabetical order. Input A3 to sort the results of Class A from highest to lowest. Input B1 to sort the results of Class B as an average. Input B2 to sort the results of Class B in alphabetical order. Input B3 to sort the results of Class B from highest to lowest. Input C1 to sort the results of Class C as an average. Input C2 to sort the results of Class C in alphabetical order. Input C3 to sort the results of Class C from highest to lowest. ")
if classanddatasorter == "A1":
    df = pd.read_csv('classa.csv')
    df[["score1", "score2","score3"]].mean(axis=1)
elif classanddatasorter == "A2":
    df = pd.read_csv('classa.csv')
    saved_column = df.column_name
    name = df.name
    name.sort
elif classanddatasorter == "A3":
    df = pd.read_csv('classa.csv')
    df.sort[('score1','score2','score3'], ascending=False)
elif classanddatasorter == "B1":
    df = pd.read_csv('classb.csv')
    df[["score1", "score2","score3"]].mean(axis=1)
elif classanddatasorter == "B2":
    df = pd.read_csv('classb.csv')
    saved_column = df.column_name
    name = df.name
elif classanddatasorter == "B3":
    df = pd.read_csv('classb.csv')
    df.sort[('score1','score2','score3'], ascending=False)
elif classanddatasorter == "C1":
    df = pd.read_csv('classc.csv')
    df[["score1", "score2","score3"]].mean(axis=1)
elif classanddatasorter == "C2":
    bamboo = pd.read_csv('classc.csv')
    saved_column = df.column_name
    name = df.name
    name.sort
elif classanddatasorter == "C3":
    df = pd.read_csv('classc.csv')
    df.sort[('score1','score2','score3'], ascending=False)
So far I have received the following errors.
When trying to sort by average:
Traceback (most recent call last):
File "C:\Users\MVMCJK\Downloads\Python code\Seperate independent draft of Task 3 (not intergated with Task 1 and 2) draft 3.py", line 70, in <module>
df[["score1", "score2","score3"]].mean(axis=1)
File "C:\Users\MVMCJK\Anaconda3\lib\site-packages\pandas\core\frame.py", line 1791, in __getitem__
return self._getitem_array(key)
File "C:\Users\MVMCJK\Anaconda3\lib\site-packages\pandas\core\frame.py", line 1835, in _getitem_array
indexer = self.ix._convert_to_indexer(key, axis=1)
File "C:\Users\MVMCJK\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1112, in _convert_to_indexer
raise KeyError('%s not in index' % objarr[mask])
KeyError: "['score1' 'score2' 'score3'] not in index"
When trying to sort alphabetically:
Traceback (most recent call last):
File "C:\Users\MVMCJK\Downloads\Python code\Seperate independent draft of Task 3 (not intergated with Task 1 and 2) draft 3.py", line 74, in <module>
saved_column = df.column_name
File "C:\Users\MVMCJK\Anaconda3\lib\site-packages\pandas\core\generic.py", line 2150, in __getattr__
(type(self).__name__, name))
AttributeError: 'DataFrame' object has no attribute 'column_name'
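The last part does not even remotely work: the program refuses to run at all because of invalid syntax, so I have to remove that part to get it to run, and then entering A3 gives no response at all. I have tried googling the KeyError and the AttributeError, but I could not find anything relevant to my problem that led me to a fix. Does anyone know what is wrong with my program? Any help would be greatly appreciated.
Edit: updated code, which still does not run: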
print("Welcome to the Database sorter. The system works based on the following functions. Choose your class by inputting a letter, and choose the method of sorting the data by inputing a number afterwards. A is for Class A, B is for Class B and C is the Class C.1 is for soritng the data as an average, 2 is for sorting the data in alphabetical order and 3 is for sorting the data from highest to lowest.")
classanddatasorter =''
while classanddatasorter not in ["A1","A2","A3","B1","B2","B3","C1","C2","C3"]:
    classanddatasorter = input("You have the following nine options. Input A1 to sort the results of Class A as an average. Input A2 to sort the results of Class A in alphabetical order. Input A3 to sort the results of Class A from highest to lowest. Input B1 to sort the results of Class B as an average. Input B2 to sort the results of Class B in alphabetical order. Input B3 to sort the results of Class B from highest to lowest. Input C1 to sort the results of Class C as an average. Input C2 to sort the results of Class C in alphabetical order. Input C3 to sort the results of Class C from highest to lowest. ")
if classanddatasorter == "A1":
    df = pd.read_csv('classa.csv')
    df['average'] = df[['score1', 'score2', 'score3']].mean(axis=1)
elif classanddatasorter == "A2":
    df = pd.read_csv('classa.csv', index_col='name1')
    saved_column = df.column_name
    name = df.name
    name.sort
elif classanddatasorter == "A3":
    df = pd.read_csv('classa.csv')
    scores = df[['score1', 'score2', 'score3']].values
    scores.sort(axis=1)
elif classanddatasorter == "B1":
    df = pd.read_csv('classb.csv')
    df['average'] = df[["score1", "score2","score3"]].mean(axis=1)
elif classanddatasorter == "B2":
    df = pd.read_csv('classb.csv',index_col='name1')
    saved_column = df.column_name
    name = df.name
elif classanddatasorter == "B3":
    df = pd.read_csv('classb.csv')
    scores = df[['score1', 'score2', 'score3']].values
    scores.sort(axis=1)
elif classanddatasorter == "C1":
    df = pd.read_csv('classc.csv')
    df['average'] = df[["score1", "score2","score3"]].mean(axis=1)
elif classanddatasorter == "C2":
    df = pd.read_csv('classc.csv',index_col='name1')
    saved_column = df.column_name
    name = df.name
    df = name.sort
elif classanddatasorter == "C3":
    df = pd.read_csv('classc.csv')
    scores = df[['score1', 'score2', 'score3']].values
    scores.sort(axis=1)
Edit 2: updated with some of bakkal's code samples.
编辑2:更新了一些bakkal的代码示例。
print("Welcome to the Database sorter. The system works based on the following functions. Choose your class by inputting a letter, and choose the method of sorting the data by inputing a number afterwards. A is for Class A, B is for Class B and C is the Class C.1 is for soritng the data as an average, 2 is for sorting the data in alphabetical order and 3 is for sorting the data from highest to lowest.")
classanddatasorter =''
while classanddatasorter not in ["A1","A2","A3","B1","B2","B3","C1","C2","C3"]:
    classanddatasorter = input("You have the following nine options. Input A1 to sort the results of Class A as an average. Input A2 to sort the results of Class A in alphabetical order. Input A3 to sort the results of Class A from highest to lowest. Input B1 to sort the results of Class B as an average. Input B2 to sort the results of Class B in alphabetical order. Input B3 to sort the results of Class B from highest to lowest. Input C1 to sort the results of Class C as an average. Input C2 to sort the results of Class C in alphabetical order. Input C3 to sort the results of Class C from highest to lowest. ")
if classanddatasorter == "A1":
    df = pd.read_csv('classa.csv')
    print('Sorted by name1')
    df.sort('name1')
    print(df)
elif classanddatasorter == "A2":
    df = pd.read_csv('classa.csv')
    print('Sorted by average column')
    df['average'] = df[['score1', 'score2', 'score3']].mean(axis=1)
    print(df)
    print(df[['name1', 'name2', 'average']].sort('average'))
elif classanddatasorter == "A3":
    df = pd.read_csv('classa.csv')
    print('Sorted scores')
    scores = df[['score1', 'score2', 'score3']].values
    scores.sort(axis=1)
    for i in xrange(0, scores.shape[1]):
        column_name = 'rank{}'.format(i)
        df[column_name] = scores[:, i]
    print(df[['name1', 'name2', 'rank2', 'rank1', 'rank0']])
elif classanddatasorter == "B1":
    df = pd.read_csv('classb.csv')
    print('Sorted by name1')
    df.sort('name1')
    print(df)
elif classanddatasorter == "B2":
    df = pd.read_csv('classb.csv')
    print('Sorted by average column')
    df['average'] = df[['score1', 'score2', 'score3']].mean(axis=1)
    print(df)
    print(df[['name1', 'name2', 'average']].sort('average'))
elif classanddatasorter == "B3":
    df = pd.read_csv('classb.csv')
    print('Sorted scores')
    scores = df[['score1', 'score2', 'score3']].values
    scores.sort(axis=1)
    for i in xrange(0, scores.shape[1]):
        column_name = 'rank{}'.format(i)
        df[column_name] = scores[:, i]
    print(df[['name1', 'name2', 'rank2', 'rank1', 'rank0']])
elif classanddatasorter == "C1":
    df = pd.read_csv('classc.csv')
    print('Sorted by name1')
    df.sort('name1')
    print(df)
elif classanddatasorter == "C2":
    df = pd.read_csv('classc.csv')
    print('Sorted by average column')
    df['average'] = df[['score1', 'score2', 'score3']].mean(axis=1)
    print(df)
    print(df[['name1', 'name2', 'average']].sort('average'))
elif classanddatasorter == "C3":
    df = pd.read_csv('classc.csv')
    print('Sorted scores')
    scores = df[['score1', 'score2', 'score3']].values
    scores.sort(axis=1)
    for i in xrange(0, scores.shape[1]):
        column_name = 'rank{}'.format(i)
        df[column_name] = scores[:, i]
    print(df[['name1', 'name2', 'rank2', 'rank1', 'rank0']])
Answer 0 (score: 0)
答案 0 :(得分:0)
Suppose we have a CSV file like this (comma-separated, with no spaces after the commas; otherwise you need to use CSV options specific to your format):
scores.csv
name1,name2,score1,score2,score3
Atticus,Finch,9,8,10
Jem,Finch,5,7,6
Jean Louise,Finch,3,2,4
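The KeyError in the question is consistent with the file being read as a single column: the sample shown in the question is separated by spaces, while read_csv splits on commas by default. A minimal sketch of that effect, assuming the data is held in in-memory strings (the variable names below are illustrative and not from the question):

```python
import io
import pandas as pd

# Hypothetical in-memory stand-ins for the two file layouts.
space_separated = "name1 name2 score1 score2 score3\nAtticus Finch 9 8 10\n"
comma_separated = "name1,name2,score1,score2,score3\nAtticus,Finch,9,8,10\n"

# With the default comma separator, the whole header line becomes one column name,
# so a lookup such as df['score1'] has nothing to find.
bad = pd.read_csv(io.StringIO(space_separated))
print(list(bad.columns))

# With commas in the data, each field gets its own column.
good = pd.read_csv(io.StringIO(comma_separated))
print(list(good.columns))
```

Printing df.columns right after reading a file is a quick way to check which of the two cases you are in.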
We read the CSV file:
import pandas as pd
df = pd.read_csv('scores.csv')
Now df is:
name1 name2 score1 score2 score3
0 Atticus Finch 9 8 10
1 Jem Finch 5 7 6
2 Jean Louise Finch 3 2 4
and df.columns is:
Index([u'name1', u'name2', u'score1', u'score2', u'score3'], dtype='object')
You can see that df has a columns attribute but no column_name attribute, hence your error:
AttributeError: 'DataFrame' object has no attribute 'column_name'
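Now let's sort alphabetically (on current pandas the method is sort_values; the older df.sort was removed in pandas 0.20):
df.sort_values('name1')
The result is: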
name1 name2 score1 score2 score3
0 Atticus Finch 9 8 10
2 Jean Louise Finch 3 2 4
1 Jem Finch 5 7 6
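Note that sorting returns a new frame rather than reordering df in place, so the result has to be assigned or printed. A small sketch, assuming a recent pandas where the method is named sort_values, and with the sample data built in memory instead of read from a file:

```python
import pandas as pd

# Same rows as scores.csv, built in memory for the sake of the example.
df = pd.DataFrame({
    'name1': ['Atticus', 'Jem', 'Jean Louise'],
    'name2': ['Finch', 'Finch', 'Finch'],
    'score1': [9, 5, 3],
    'score2': [8, 7, 2],
    'score3': [10, 6, 4],
})

by_name = df.sort_values('name1')  # df itself keeps its original row order
print(by_name)
```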
You want the average, so let's add a column:
df['average'] = df[['score1', 'score2', 'score3']].mean(axis=1)
df now has a new column that you can sort by!
name1 name2 score1 score2 score3 average
0 Atticus Finch 9 8 10 9
1 Jem Finch 5 7 6 6
2 Jean Louise Finch 3 2 4 3
If you only want to see the average column, highest first:
df[['name1', 'name2', 'average']].sort_values('average', ascending=False)
name1 name2 average
0 Atticus Finch 9
1 Jem Finch 6
2 Jean Louise Finch 3
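To list the highest average first, as in the table above, the sort has to be descending, since the default order is ascending. A short sketch under the same assumptions as before (recent pandas, sample data built in memory):

```python
import pandas as pd

df = pd.DataFrame({
    'name1': ['Atticus', 'Jem', 'Jean Louise'],
    'name2': ['Finch', 'Finch', 'Finch'],
    'score1': [9, 5, 3],
    'score2': [8, 7, 2],
    'score3': [10, 6, 4],
})

# Row-wise mean of the three score columns (the result is a float column).
df['average'] = df[['score1', 'score2', 'score3']].mean(axis=1)

# ascending=False puts the highest average at the top.
by_average = df[['name1', 'name2', 'average']].sort_values('average', ascending=False)
print(by_average)
```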
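The last sort you want, by score, is a bit trickier because the data is not tidy/normalized, but here is an attempt:
scores = df[['score1', 'score2', 'score3']].values.copy()  # copy, so the in-place sort below cannot touch df
scores
which now looks like this: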
array([[ 9, 8, 10],
[ 5, 7, 6],
[ 3, 2, 4]])
We sort the scores array in place, row by row:
scores.sort(axis=1)
scores is now:
array([[ 8, 9, 10],
[ 5, 6, 7],
[ 2, 3, 4]])
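These are the sorted scores you want, so we put them into our df. We have to do this for each score column, so we can use scores.shape[1], which is the number of columns in that 2D array:
for i in range(scores.shape[1]):  # range replaces Python 2's xrange
    column_name = 'rank{}'.format(i)
    df[column_name] = scores[:, i]
Now our df looks like this: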
name1 name2 score1 score2 score3 rank0 rank1 rank2
0 Atticus Finch 9 8 10 8 9 10
1 Jem Finch 5 7 6 5 6 7
2 Jean Louise Finch 3 2 4 2 3 4
To get the display you want:
df[['name1', 'name2', 'rank2', 'rank1', 'rank0']]
name1 name2 rank2 rank1 rank0
0 Atticus Finch 10 9 8
1 Jem Finch 7 6 5
2 Jean Louise Finch 4 3 2
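As a possible alternative to the rank columns, the row-wise sort and the reversal can be done in one step with NumPy. This is only a sketch: the column names highest, middle and lowest are made up for the example, and the sample data is built in memory.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'name1': ['Atticus', 'Jem', 'Jean Louise'],
    'name2': ['Finch', 'Finch', 'Finch'],
    'score1': [9, 5, 3],
    'score2': [8, 7, 2],
    'score3': [10, 6, 4],
})
score_cols = ['score1', 'score2', 'score3']

# np.sort returns a sorted copy (ascending within each row);
# [:, ::-1] then reverses every row so the highest score comes first.
ranked = np.sort(df[score_cols].to_numpy(), axis=1)[:, ::-1]

# Hypothetical column names for the display frame.
sorted_scores = pd.DataFrame(ranked, columns=['highest', 'middle', 'lowest'], index=df.index)
result = pd.concat([df[['name1', 'name2']], sorted_scores], axis=1)
print(result)
```

Because np.sort works on a copy, the original score1, score2 and score3 columns in df are left untouched.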
You can read this PDF paper to learn more about tidy data. Basically, many operations would be easier if, for example, your data looked like this
name, test, score
bob, 1, 10
bob, 2, 9
rather than
name, score1, score2
bob, 10, 9
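One way to get from the wide layout to the tidy one is DataFrame.melt. The sketch below uses the bob example from above; the variable names wide and tidy are only illustrative.

```python
import pandas as pd

# The wide layout: one column per test.
wide = pd.DataFrame({'name': ['bob'], 'score1': [10], 'score2': [9]})

# melt produces one row per (name, test) pair.
tidy = wide.melt(id_vars='name', var_name='test', value_name='score')

# Turn the labels 'score1' / 'score2' into the test numbers 1 / 2.
tidy['test'] = tidy['test'].str.replace('score', '', regex=False).astype(int)
print(tidy)

# With one score per row, the average is a plain groupby.
averages = tidy.groupby('name')['score'].mean()
print(averages)
```

In the tidy layout, adding a fourth test means adding rows rather than a new column, so the averaging code does not need to change.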
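Putting the steps above together:
import pandas as pd

df = pd.read_csv('scores.csv')
print('Original Data')
print(df)

print('Sorted by name1')
# sort_values returns a new frame, so print the result (DataFrame.sort was removed in pandas 0.20)
print(df.sort_values('name1'))

print('Sorted by average column')
df['average'] = df[['score1', 'score2', 'score3']].mean(axis=1)
print(df)
print(df[['name1', 'name2', 'average']].sort_values('average', ascending=False))

print('Sorted scores')
# Copy the values so the in-place sort cannot touch df
scores = df[['score1', 'score2', 'score3']].values.copy()
scores.sort(axis=1)  # sort each row in ascending order
for i in range(scores.shape[1]):  # range replaces Python 2's xrange
    column_name = 'rank{}'.format(i)
    df[column_name] = scores[:, i]
print(df[['name1', 'name2', 'rank2', 'rank1', 'rank0']])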
Instead of using print(), you can also save the resulting data frame to another .csv file with .to_csv('score_sorted_avg.csv').
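The nine if/elif branches in the question differ only in the file name and the sort method, so they could be collapsed into one function. The following is a sketch rather than a drop-in replacement: sort_scores and SCORE_COLS are hypothetical names, the method codes follow the question's prompt (1 for average, 2 for alphabetical, 3 for highest to lowest), and the sample data is built in memory.

```python
import numpy as np
import pandas as pd

SCORE_COLS = ['score1', 'score2', 'score3']

def sort_scores(df, method):
    """Return a new frame for method '1' (average), '2' (alphabetical) or '3' (scores high to low)."""
    if method == '1':
        out = df[['name1', 'name2']].copy()
        out['average'] = df[SCORE_COLS].mean(axis=1)
        return out.sort_values('average', ascending=False)
    if method == '2':
        return df.sort_values('name1')
    if method == '3':
        # Sort each row ascending, then reverse it so the highest score comes first.
        ranked = np.sort(df[SCORE_COLS].to_numpy(), axis=1)[:, ::-1]
        out = df[['name1', 'name2']].copy()
        for i, col in enumerate(SCORE_COLS):
            out[col] = ranked[:, i]
        return out
    raise ValueError('unknown method: {}'.format(method))

# In-memory stand-in for one class file.
df = pd.DataFrame({
    'name1': ['Atticus', 'Jem', 'Jean Louise'],
    'name2': ['Finch', 'Finch', 'Finch'],
    'score1': [9, 5, 3],
    'score2': [8, 7, 2],
    'score3': [10, 6, 4],
})
for method in '123':
    print(sort_scores(df, method))
```

With the question's files, an input such as "A1" could then be split into the class letter (to pick classa.csv, classb.csv or classc.csv) and the method digit, and the returned frame could be written out with to_csv instead of printed.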