I have a DataFrame built from a CSV file containing StudentID, Name, and Assignment 1, 2, 3, etc. The CSV file is supplied as input, so its values may vary.
If the StudentIDs are not unique, I want to print a list of error messages. The code below works as intended when gradesM3.csv contains duplicates:
import pandas as pd

grades = pd.read_csv('gradesM3.csv', sep=';')
duplicates = pd.concat(g for _, g in grades.groupby("StudentID") if len(g) > 1)
zipped = zip(duplicates['StudentID'])
for student in zipped:
    print(f'The student ID {student} appears multiple times.')
However, if I change the CSV file so that it contains no duplicate StudentIDs, I get the following error:
ValueError: No objects to concatenate
I am trying to write code that prints the following whenever there are duplicates:
The student ID ('s123789',) appears multiple times.
The student ID ('s123789',) appears multiple times.
The student ID ('s123789',) appears multiple times.
and, if there are none, prints this instead:
There are no duplicates in your file.
I tried the following code:
grades = pd.read_csv('gradesM3.csv', sep=';')
duplicates = pd.concat(g for _, g in grades.groupby("StudentID") if len(g) > 1)
if len(duplicates) > 0:
    zipped = zip(duplicates['StudentID'])
    for student in zipped:
        print(f'The student ID {student} appears multiple times.')
else:
    print('The grades are correctly scaled along the 7-point grading system.')
But I get the same error message:
ValueError: No objects to concatenate.
Thanks in advance for your help.
Answer 0 (score: 1)
The problem is that the error is raised by the following line:
duplicates = pd.concat(g for _, g in grades.groupby("StudentID") if len(g) > 1)
Because you only handle the empty case after that line has already run, the error still occurs. One solution is to use a try/except block:
import pandas as pd

grades = pd.read_csv('gradesM3.csv', sep=';')
try:
    duplicates = pd.concat(g for _, g in grades.groupby("StudentID") if len(g) > 1)
    zipped = zip(duplicates['StudentID'])
    for student in zipped:
        print(f'The student ID {student} appears multiple times.')
except ValueError:
    print('The grades are correctly scaled along the 7-point grading system.')
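If you would rather not rely on the exception, a minimal sketch (not part of the original answer) is to collect the oversized groups into a list first and only call pd.concat when that list is non-empty:

import pandas as pd

grades = pd.read_csv('gradesM3.csv', sep=';')

# Collect every group of rows whose StudentID occurs more than once.
groups = [g for _, g in grades.groupby("StudentID") if len(g) > 1]

if groups:
    duplicates = pd.concat(groups)
    for student in duplicates['StudentID']:
        print(f'The student ID {student} appears multiple times.')
else:
    print('There are no duplicates in your file.')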
Answer 1 (score: 1)
A more direct solution is to use the pandas duplicated method:
import pandas as pd
# Example data
df = pd.DataFrame({'id': [1, 2, 2, 4, 5, 1], 'name': ["a", "b", "b", "d", "e", "a"]})
print(df)
# id name
#0 1 a
#1 2 b
#2 2 b
#3 4 d
#4 5 e
#5 1 a
# Get the duplicates - each df row where the id column is duplicated
df_duplicates = df[df['id'].duplicated()]
for id in df_duplicates['id']:
    print(f"Student {id} is a duplicate")
#Student 2 is a duplicate
#Student 1 is a duplicate
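Note that duplicated() with its default keep='first' only marks the repeats after the first occurrence. To get closer to the output asked for in the question (one line per occurrence, plus a message when there are no duplicates at all), a possible variant is to pass keep=False so that every row of a repeated id is flagged. This is a sketch building on the same example DataFrame as above:

# keep=False flags every row whose id appears more than once, not just the later repeats.
all_duplicates = df[df['id'].duplicated(keep=False)]

if all_duplicates.empty:
    print('There are no duplicates in your file.')
else:
    for id in all_duplicates['id']:
        print(f"Student {id} is a duplicate")
# Student 1 is a duplicate
# Student 2 is a duplicate
# Student 2 is a duplicate
# Student 1 is a duplicate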