Sklearn models take too long in Python with very little data

Date: 2017-02-12 18:13:17

Tags: python scikit-learn svm data-science bigdata

I have been using sklearn models (SVM, logistic regression, MLP, ...) and until yesterday I had no problems, but now, for reasons I don't understand, fitting a model takes a huge amount of time.

For example, fitting an SVM with a polynomial kernel on 551 samples with 6 features, varying the parameters:

  • C = 1.00 Degree = 1.00 Coef = 0.000 Gamma = 0.25 → 15.124 seconds
  • C = 1.00 Degree = 1.00 Coef = 0.000 Gamma = 0.75 → 22.937 seconds
  • C = 1.00 Degree = 1.00 Coef = 3.000 Gamma = 0.25 → 11.703 seconds
  • C = 1.00 Degree = 1.00 Coef = 3.000 Gamma = 0.75 → 18.810 seconds
  • C = 1.00 Degree = 2.00 Coef = 0.000 Gamma = 0.25 → 316.115 seconds
  • C = 1.00 Degree = 2.00 Coef = 0.000 Gamma = 0.75 → 74.530 seconds
  • C = 1.00 Degree = 2.00 Coef = 3.000 Gamma = 0.25 → 270.514 seconds
  • C = 1.00 Degree = 2.00 Coef = 3.000 Gamma = 0.75 → 357.194 seconds

    from time import time
    from sklearn import svm

    # C_values, degree_values, coef_values, gamma_values, X_train, Y_train
    # and decision_function_shape are defined earlier (not shown).
    for C in C_values:
        for degree in degree_values:
            for coef in coef_values:
                for gamma in gamma_values:
                    print('C = {0:.2f} Degree = {1:.2f} Coef = {2:.3f} Gamma = {3:.2f}'.format(C, degree, coef, gamma))
                    clf = svm.SVC(C=C, degree=degree, gamma=gamma, coef0=coef,
                                  kernel='poly',
                                  decision_function_shape=decision_function_shape)
                    time_i = time()
                    clf.fit(X_train, Y_train)
                    time_f = time()
                    print(str(time_f - time_i) + ' seconds')

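As an aside, a nested parameter loop like the one above can also be written with scikit-learn's `GridSearchCV`, which parallelizes the fits via `n_jobs`. This is only a sketch: the random `X_train`/`Y_train` below are hypothetical stand-ins with the same shape as the data in the question (551 samples, 6 features), not the real data.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Hypothetical data with the same shape as in the question.
rng = np.random.RandomState(0)
X_train = rng.rand(551, 6)
Y_train = rng.randint(0, 2, 551)

# The same grid as the timed runs above.
param_grid = {
    'C': [1.0],
    'degree': [1, 2],
    'coef0': [0.0, 3.0],
    'gamma': [0.25, 0.75],
}

# cv=3 cross-validates each of the 8 parameter combinations;
# n_jobs=-1 runs the fits in parallel on all cores.
search = GridSearchCV(SVC(kernel='poly'), param_grid, cv=3, n_jobs=-1)
search.fit(X_train, Y_train)
print(search.best_params_)
```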

I haven't changed anything between yesterday and today, and I think 5 minutes to fit a model on 550 samples with 6 features is excessive.
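One common culprit for poly-kernel SVM fits that suddenly take minutes on tiny datasets is unscaled input features: libsvm's optimizer can take far longer to converge when features are on very different scales. As a sanity check, one can time a fit on standardized features. This sketch uses hypothetical random data of the same shape (551 × 6) in place of the real `X_train`/`Y_train`:

```python
import numpy as np
from time import time
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical, deliberately unscaled stand-in data (551 samples, 6 features).
rng = np.random.RandomState(0)
X_train = rng.rand(551, 6) * 100
Y_train = rng.randint(0, 2, 551)

# Standardize each feature to zero mean and unit variance before fitting.
X_scaled = StandardScaler().fit_transform(X_train)

clf = SVC(C=1.0, degree=2, gamma=0.25, coef0=0.0, kernel='poly')
time_i = time()
clf.fit(X_scaled, Y_train)
print(str(time() - time_i) + ' seconds')
```

If the scaled fit is fast while the unscaled one is not, the slowdown is in the data (e.g. a feature whose range changed), not in sklearn itself.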

  

What could cause this increase in time?

Thanks

EDIT

I have tried several runs with different values of the random_state parameter, and the results are similar:

  • C = 1.00 Degree = 1.00 Coef = 0.000 Gamma = 0.25 → 7.359 seconds
  • C = 1.00 Degree = 1.00 Coef = 0.000 Gamma = 0.75 → 26.156 seconds
  • C = 1.00 Degree = 1.00 Coef = 3.000 Gamma = 0.25 → 8.781 seconds
  • C = 1.00 Degree = 1.00 Coef = 3.000 Gamma = 0.75 → 16.437 seconds
  • C = 1.00 Degree = 2.00 Coef = 0.000 Gamma = 0.25 → 259.248 seconds
  • C = 1.00 Degree = 2.00 Coef = 0.000 Gamma = 0.75 → 219.170 seconds
  • C = 1.00 Degree = 2.00 Coef = 3.000 Gamma = 0.25 → 302.201 seconds
  • C = 1.00 Degree = 2.00 Coef = 3.000 Gamma = 0.75 → 345.163 seconds

0 Answers