I have a DataFrame (df) that I read from an Excel file with this code:
import pandas as pd
xls_file = pd.ExcelFile('/Users/Desktop/df_1.xlsx')
df = xls_file.parse('Sheet1')
A shortened version of the DataFrame, transposed for display, looks like this:
0 1 2 3 4
S_Id SB001 SB001 SB001 SB001 SB001
Reg 27548 27548 27548 27548 27548
Visit 1 2 1 2 3
Planned 5 NaN 5 NaN NaN
Planned_2 NaN NaN NaN NaN NaN
Visit_Date 15-07-22 15-10-01 15-07-22 15-10-01 16-08-01
Weight 69 70 69 70 68.3
Height 170 NaN 170 NaN NaN
Consent 1 NaN 1 NaN NaN
Filled_Q1 1 1 1 1 1
Filled_Q2 1 1 1 1 1
Other_Id NaN NaN NaN NaN NaN
Class1_Taken 1 1 1 1 1
Class1_Date 15-07-22 15-10-01 15-07-22 15-10-01 16-08-01
Class2_Taken 1 1 1 1 1
Class2_Date 15-07-22 15-10-01 15-07-22 15-10-01 16-08-01
Class2_Time 11:05 11:55 11:05 11:55 14:05
Class3_Taken 1 1 1 1 1
Class3_Date 15-07-22 15-10-01 15-07-22 15-10-01 16-08-01
Class3_Time 10:50 10:45 10:50 10:45 13:20
Class4_Taken 1 1 1 1 1
Class5_Taken 1 1 1 1 1
Class6_Taken 1 1 1 1 1
Class7_Taken 1 1 1 1 1
Class8_Taken 0 0 0 0 0
Now, if I call the .duplicated() method on it, I get:
0 False
1 False
2 False
3 False
4 False
dtype: bool
That is not correct, because rows 0 and 2 are identical, and so are rows 1 and 3.
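For reference, here is a minimal sketch of the behaviour I expected, using a small hand-built frame rather than my real data. The frame `toy` and the helper `differing_columns` are hypothetical names made up for this illustration, and the frame copies only a few of the columns shown above. On this toy data .duplicated() does flag the repeated rows, so I assume something in my real file (for example a dtype or a hidden character in one column) makes the rows differ.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame mirroring a few columns of the data above.
toy = pd.DataFrame({
    "S_Id": ["SB001"] * 5,
    "Visit": [1, 2, 1, 2, 3],
    "Planned": [5, np.nan, 5, np.nan, np.nan],
    "Weight": [69, 70, 69, 70, 68.3],
})

# duplicated() marks a row True when an earlier row has the same value
# in every column; NaN in the same position is treated as equal.
dup_mask = toy.duplicated()
print(dup_mask.tolist())  # [False, False, True, True, False]


def differing_columns(frame, i, j):
    """Return the column names where rows i and j differ (NaN == NaN)."""
    a, b = frame.loc[i], frame.loc[j]
    same = (a == b) | (a.isna() & b.isna())
    return same[~same].index.tolist()


# An empty list means the two rows match in every column.
print(differing_columns(toy, 0, 2))  # []
print(differing_columns(toy, 0, 1))  # ['Visit', 'Planned', 'Weight']
```

Running the same helper on my real df, e.g. differing_columns(df, 0, 2), should list whichever columns stop rows 0 and 2 from being treated as duplicates.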
Can anyone help? Thanks in advance!