python - Interpreting training, test (dev), and validation scores in machine learning

As you explained in the comments, your test set is the set you used to tune your parameters, and your validation set is the set that your model didn't use for training.
Given that, it's natural that your validation scores are lower than the other scores.
When you train a machine learning model, you show it the training set. That's why the model gets its best scores on the training set, i.e. on samples it has already seen and knows the answers for.
You use the validation set to tune your parameters (hyperparameters such as the degree of complexity in a regression), so those parameters are fine-tuned for the validation set even though the model has not been trained on it. (You used the term test set for this, and to be fair, the two terms are sometimes used that way.)
Finally, you get the lowest score on your test set, which is natural, since the parameters were not tuned for the test set and the model has never seen those samples before.
If there is a huge gap between your training and test results, your model may have overfit the training data, and there are ways to avoid that.
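
To make the three roles concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration, not your setup: the data is synthetic (y = x^2 plus noise), the model is a small hand-written k-nearest-neighbours regressor, and the names `knn_predict`, `mse`, and `candidate_ks` are hypothetical. It also reports an error (lower is better) rather than a score (higher is better), so the ordering described above is flipped: you would typically expect the smallest error on the training set.

```python
import random


def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def mse(train, data, k):
    """Mean squared error of a k-NN model built on `train`, measured on `data`."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in data) / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical noisy data set: y = x^2 + Gaussian noise
points = []
for _ in range(300):
    x = rng.uniform(-3, 3)
    points.append((x, x * x + rng.gauss(0, 1)))

# 60% training, 20% validation, 20% test
train, val, test = points[:180], points[180:240], points[240:]

# Tune the hyperparameter k using the validation set only
candidate_ks = [1, 3, 5, 9, 15, 25]
val_errors = {k: mse(train, val, k) for k in candidate_ks}
best_k = min(val_errors, key=val_errors.get)

print("best k:", best_k)
print("train MSE:", mse(train, train, best_k))     # samples the model has seen
print("validation MSE:", mse(train, val, best_k))  # k was chosen to make this small
print("test MSE:", mse(train, test, best_k))       # untouched until now

# With k=1 every training point is its own nearest neighbour, so the
# training error should be exactly 0.0: the model has memorised the set.
print("train MSE with k=1:", mse(train, train, 1))
```

The last line is the overfitting case in miniature: a perfect result on the training set says nothing about how the model does on samples it has not seen. Only the test error, which played no part in fitting or in choosing `k`, is an honest estimate.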

Hope this helped ;)
