对于我的实验,我正在使用KNN对一些数据集进行分类(为重现性共享here)。这是我的源代码。
import numpy as np
from numpy import genfromtxt
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt
types = {
"Data_G": ["datag_s.csv", "datag_m.csv"],
"Data_V": ["datav_s.csv", "datav_m.csv"],
"Data_C": ["datac_s.csv", "datac_m.csv"],
"Data_R": ["datar_s.csv", "datar_m.csv"]
}
dataset = None
ground_truth = None
for idx, csv_list in types.items():
for csv_f in csv_list:
col_time,col_window = np.loadtxt(csv_f,delimiter=',').T
trailing_window = col_window[:-1] # "past" values at a given index
leading_window = col_window[1:] # "current values at a given index
decreasing_inds = np.where(leading_window < trailing_window)[0]
beta_value = leading_window[decreasing_inds]/trailing_window[decreasing_inds]
quotient_times = col_time[decreasing_inds]
my_data = genfromtxt(csv_f, delimiter=',')
my_data = my_data[:,1]
my_data = my_data[:int(my_data.shape[0]-my_data.shape[0]%200)].reshape(-1, 200)
labels = np.full(1, idx)
if dataset is None:
dataset = beta_value.reshape(1,-1)[:,:15]
else:
dataset = np.concatenate((dataset,beta_value.reshape(1,-1)[:,:15]))
if ground_truth is None:
ground_truth = labels
else:
ground_truth = np.concatenate((ground_truth,labels))
X_train, X_test, y_train, y_test = train_test_split(dataset, ground_truth, test_size=0.25, random_state=42)
knn_classifier = KNeighborsClassifier(n_neighbors=3, weights='distance', algorithm='auto', leaf_size=300, p=2, metric='minkowski')
knn_classifier.fit(X_train, y_train)
当我执行以下操作
plot_data=dataset.transpose()
plt.plot(plot_data)
它产生以下图。
我将图例添加为以下内容:
plt.plot(plot_data, label=idx)
plt.legend()
但是,正如所看到的,它正在用Data_R
替换所有图例。我在这里做什么错了?
答案 0 :(得分:1)
在回答这个问题之前,我要说的一件事是,在遍历字典时,我始终会保持谨慎。在Python 3.6之前,不对字典进行排序,因此,如果需要保证字典中的顺序,则应使用OrderedDict
。如果您运行的是Python3.6 +,则无需担心。反正...
在for循环for idx, csv_list in types.items():
之后,我们将始终拥有idx = "Data_R"
(假设您的字典已排序)。
因此,当您使用plt.plot(plot_data, label=idx)
进行绘制时,所有线条的标签都将设置为"Data_R"
。
相反,您应该在行上循环,并一次将标签添加到其中。
for i, key in enumerate(types.keys()):
plt.plot(plot_data[:, 2*i], label=key)
plt.plot(plot_data[:, 2*i+1], label=key)
plt.legend()