Code:
import pandas as pd

dataset = pd.read_csv("/Users/Akshita/Desktop/EE660/donor_raw_data_medmean.csv", header=None, names=None)
# Separate data and label
X_label = dataset[1:19373][0]
X_data = dataset[1:19373]
print(X_data[X_label==1])
I get this output (there are actually 4000 samples with label = 1):
0 1 2 3 4 5 6 7 8 9 ... 51 52 53 54 55 56 57 58 \
16386 1 17 60 0 1 0 0 0 0 1 ... 0 20 20 20 5 10 15 15
16396 1 137 60 0 1 0 0 0 0 1 ... 15 25 10 15 6 14 16 120
16399 1 89 54 0 1 0 0 0 0 1 ... 10 15 5 15 6 14 16 79
16402 1 89 75 0 1 0 0 0 0 1 ... 25 35 10 35 6 13 15 79
..
..
19356 1 101 80 1 0 0 1 0 0 2 ... 25 30 5 28 7 16 18 101
19363 1 65 70 1 0 0 1 0 0 1 ... 7 12 5 10 4 8 20 63
19372 1 29 70 0 0 0 1 0 0 2 ... 0 25 25 25 4 9 24 24
..
[859 rows x 61 columns]
and for
print(X_data[X_label==0])
I get this output (there are about 15000 samples with label = 0):
0 1 2 3 4 5 6 7 8 9 ... 51 52 53 54 55 56 57 58 \
16384 0 17 74 0 1 0 0 0 0 1 ... 0 15 15 15 4 10 17 17
16385 0 17 60 0 1 0 0 0 0 2 ... 0 15 15 15 4 11 17 17
16387 0 29 67 0 1 0 0 0 0 1 ... 0 20 20 20 5 11 23 28
16388 0 53 60 0 1 0 0 0 0 1 ... 5 30 25 30 5 11 26 52
16389 0 65 49 0 1 0 0 0 0 1 ... 30 35 5 27 6 13 16 56
..
..
19369 0 137 77 1 0 1 0 0 0 1 ... 9 10 1 10 6 13 21 130
19370 0 29 60 1 0 0 1 0 0 1 ... 0 15 15 15 3 9 23 23
19371 0 129 78 1 0 0 1 0 0 2 ... 20 25 5 25 7 24 8 129
What could I be doing wrong?
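
One way to narrow this down might be to check whether column 0 holds a mix of strings and integers. This is only a guess: with header=None the first line of the file is read as data, and the matches only starting around index 16384 would be consistent with part of the column having been parsed as text. The sketch below does not use the real file; `labels` is a small made-up Series standing in for `dataset[0]`, so the values and counts are purely illustrative.

```python
import pandas as pd

# Hypothetical stand-in for the label column: a header string, then labels
# stored partly as strings and partly as integers (object dtype).
labels = pd.Series(["label", "1", "0", "1", 1, 0, 1], dtype=object)

# Comparing against the integer 1 matches only the entries that are real integers;
# the string "1" is not equal to 1.
matches_before = (labels == 1).sum()
print(matches_before)

# Show which Python types are actually present in the column.
print(labels.map(type).value_counts())

# Drop the header row and coerce the rest to numbers before comparing.
clean = pd.to_numeric(labels.iloc[1:], errors="coerce")
matches_after = (clean == 1).sum()
print(matches_after)
```

On the real data the same check would be `dataset[0].map(type).value_counts()`. If the first line of the CSV does hold column names, passing `header=0` to `pd.read_csv` instead of `header=None` may avoid the mixed column altogether.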