Plotting accuracy and training/testing progress with a Pipeline

Time: 2020-03-16 12:00:24

Tags: python scikit-learn pipeline

I am developing a text classification model that classifies emails as spam or ham, and I want to train and test it. I am using the Pipeline provided by scikit-learn:

from sklearn.pipeline import Pipeline

Then I set up the pipeline and started the training process:

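from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# TF-IDF features feed a linear SVM classifier.
text_clf = Pipeline([('tfidf', TfidfVectorizer()),
                     ('clf', LinearSVC()),
])

text_clf.fit(X_train, y_train)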

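Now I would like to plot the training progress of the model, either by showing how the accuracy improves or by showing the training progress in general.

But I don't know how to approach this.

I have already looked at this post: Keras - Plot training, validation and test set accuracy

It contains code that might help: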

import keras
from matplotlib import pyplot as plt
history = model1.fit(train_x, train_y,validation_split = 0.1, epochs=50, batch_size=4)
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'val'], loc='upper left')
plt.show()

So I adapted the code slightly:

history = text_clf.fit(X_train, y_train,validation_split = 0.1, epochs=50, batch_size=4)
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'val'], loc='upper left')
plt.show()

Unfortunately, I get the following error message:

ValueError                                Traceback (most recent call last)
<ipython-input-46-59ac1bc1f4fa> in <module>
----> 1 history = text_clf.fit(X_train, y_train,validation_split = 0.1, epochs=50, batch_size=4)
      2 plt.plot(history.history['acc'])
      3 plt.plot(history.history['val_acc'])
      4 plt.title('model accuracy')
      5 plt.ylabel('accuracy')

~\Anaconda3\envs\abc\lib\site-packages\sklearn\pipeline.py in fit(self, X, y, **fit_params)
    348             This estimator
    349         """
--> 350         Xt, fit_params = self._fit(X, y, **fit_params)
    351         with _print_elapsed_time('Pipeline',
    352                                  self._log_message(len(self.steps) - 1)):

~\Anaconda3\envs\abc\lib\site-packages\sklearn\pipeline.py in _fit(self, X, y, **fit_params)
    278                     "pipeline using the stepname__parameter format, e.g. "
    279                     "`Pipeline.fit(X, y, logisticregression__sample_weight"
--> 280                     "=sample_weight)`.".format(pname))
    281             step, param = pname.split('__', 1)
    282             fit_params_steps[step][param] = pval

ValueError: Pipeline.fit does not accept the validation_split parameter. You can pass parameters to specific steps of your pipeline using the stepname__parameter format, e.g. `Pipeline.fit(X, y, logisticregression__sample_weight=sample_weight)`.
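
To make the stepname__parameter format from the error message concrete, here is a minimal sketch. The tiny corpus, the labels, and the weights are made up for illustration, and sample_weight is used only because LinearSVC.fit accepts it; it has nothing to do with validation splits or epochs.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Made-up toy data, for illustration only.
texts = ["win a free prize now", "free money click here",
         "meeting at noon tomorrow", "see you at lunch"]
labels = ["spam", "spam", "ham", "ham"]

text_clf = Pipeline([('tfidf', TfidfVectorizer()),
                     ('clf', LinearSVC())])

# 'clf__sample_weight' is forwarded to the fit method of the step
# named 'clf', i.e. LinearSVC.fit(..., sample_weight=weights).
weights = np.array([1.0, 1.0, 2.0, 2.0])
text_clf.fit(texts, labels, clf__sample_weight=weights)

# A keyword without the 'step__' prefix is rejected by Pipeline.fit,
# which is the error shown in the traceback above.
try:
    text_clf.fit(texts, labels, validation_split=0.1)
    rejected = False
except ValueError as exc:
    rejected = True
    print(exc)
```

Even with the prefix, a keyword such as clf__epochs would presumably still fail, because LinearSVC.fit has no epochs, batch_size, or validation_split arguments; those belong to the Keras fit method.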

So now I don't know how to solve this problem.
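
One possible direction, sketched here under assumptions rather than as a confirmed solution: LinearSVC does not return a Keras-style per-epoch history, so this sketch swaps in a different technique, a learning curve, which plots accuracy against the number of training samples using sklearn.model_selection.learning_curve. The corpus below (X_demo, y_demo) and the helper plot_learning_curve are hypothetical stand-ins for X_train and y_train.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import learning_curve
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Made-up corpus standing in for X_train / y_train (illustration only).
spam = ["win a free prize now", "free money click here",
        "claim your free reward", "cheap pills free offer"]
ham = ["meeting at noon tomorrow", "see you at lunch",
       "please review the report", "notes from the meeting"]
X_demo = (spam + ham) * 5
y_demo = (["spam"] * 4 + ["ham"] * 4) * 5

text_clf = Pipeline([('tfidf', TfidfVectorizer()),
                     ('clf', LinearSVC())])

# Refit the pipeline on growing fractions of the training folds and
# score each fit with 5-fold cross-validation.
train_sizes, train_scores, val_scores = learning_curve(
    text_clf, X_demo, y_demo,
    train_sizes=np.linspace(0.4, 1.0, 4),
    cv=5, scoring="accuracy")


def plot_learning_curve(sizes, train_scores, val_scores):
    """Plot mean train/validation accuracy against training-set size."""
    fig, ax = plt.subplots()
    ax.plot(sizes, train_scores.mean(axis=1), marker="o", label="train")
    ax.plot(sizes, val_scores.mean(axis=1), marker="o", label="val")
    ax.set_title("model accuracy")
    ax.set_xlabel("training samples")
    ax.set_ylabel("accuracy")
    ax.legend(loc="lower right")
    return fig


fig = plot_learning_curve(train_sizes, train_scores, val_scores)
```

Calling plt.show() afterwards displays the figure. The x-axis here is the training-set size, not the epoch, so the plot answers a related but different question than the Keras example. The repeated toy sentences are only there so the code runs, and the scores they produce are not meaningful.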

0 answers:

No answers