Strange SVM prediction performance in scikit-learn (LIBSVM)

Time: 2013-03-29 13:49:20

Tags: python svm scikit-learn

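I am using SVC from scikit-learn on a large 10000x1000 dataset (10000 objects with 1000 features each). I have read in other sources that LIBSVM does not scale well beyond ~10000 objects, and that is indeed what I observe: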

training time for 10000 objects: 18.9s
training time for 12000 objects: 44.2s
training time for 14000 objects: 92.7s

You can imagine what happens when I try 80000. What I find very surprising, however, is that the SVM's predict() takes even longer than the training call fit():

prediction time for 10000 objects (model was also trained on those objects): 49.0s
prediction time for 12000 objects (model was also trained on those objects): 91.5s
prediction time for 14000 objects (model was also trained on those objects): 141.84s

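It should be trivial to get prediction to run in linear time (although here it may well be close to linear), and prediction is usually much faster than training. So what is going on here?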

1 answer:

Answer 0 (score: 2)

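Are you sure you are not including the training time in your measurement of the prediction time? Do you have a code snippet showing how you took the timings?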
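For comparison, here is a minimal sketch of one way to time fit() and predict() separately, so that neither measurement can include the other. It is not the asker's code: the helper name time_fit_and_predict, the synthetic random data, and the small default sizes are all assumptions made for illustration, and the numbers it prints will not match the timings quoted in the question.

```python
import time

import numpy as np
from sklearn.svm import SVC


def time_fit_and_predict(n_samples=300, n_features=20, seed=0):
    """Return (fit_seconds, predict_seconds), each measured on its own.

    Hypothetical helper for illustration; the data is synthetic.
    """
    rng = np.random.RandomState(seed)
    X = rng.randn(n_samples, n_features)
    # Noisy binary labels driven by the first feature (assumed toy problem).
    y = (X[:, 0] + 0.5 * rng.randn(n_samples) > 0).astype(int)

    clf = SVC()  # default RBF kernel

    # Time training only.
    start = time.perf_counter()
    clf.fit(X, y)
    fit_seconds = time.perf_counter() - start

    # Time prediction only, on the same objects the model was trained on.
    start = time.perf_counter()
    predictions = clf.predict(X)
    predict_seconds = time.perf_counter() - start

    assert predictions.shape == (n_samples,)
    return fit_seconds, predict_seconds


if __name__ == "__main__":
    fit_s, predict_s = time_fit_and_predict()
    print(f"fit: {fit_s:.3f}s, predict: {predict_s:.3f}s")
```

Calling the helper with n_samples set to 10000, 12000 and 14000 and n_features set to 1000 would mirror the sizes in the question, though real data may behave differently from random data.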