Spark ML pipeline gives different values for different runs

Time: 2018-06-10 09:51:08

Tags: apache-spark pyspark apache-spark-mllib

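I have a pipeline that always receives the same input (a cached DataFrame), and on each run I change only one stage: the algorithm itself (logistic regression, random forest, etc.). I fit the pipeline on the training data, transform the test data, and check the log-likelihood of the predictions. I compute it using the trials column. As far as I know, this column should not be changed by the pipeline: it is a provided value, and only the prediction should change. Yet the summed trials (lk_trials) differ between runs, as the output below shows. What am I missing?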

finish training data, start transform data
total liklihood for algorithm : logistic_regression
+------------------+---------+--------------------+
|            lk_sum|lk_trials|     final_liklihood|
+------------------+---------+--------------------+
|-1181.226211424361|  21855.0|-0.05404832813655278|
+------------------+---------+--------------------+

finish training data, start transform data
total liklihood for algorithm : GBTRegressor
+------------------+---------+--------------------+
|            lk_sum|lk_trials|     final_liklihood|
+------------------+---------+--------------------+
|-794.1915302136496|  21376.0|-0.03715342113649184|
+------------------+---------+--------------------+

finish training data, start transform data
 total liklihood for algorithm : random_forest
+-------------------+---------+--------------------+
|             lk_sum|lk_trials|     final_liklihood|
+-------------------+---------+--------------------+
|-1494.7763490404889|  22509.0|-0.06640794122530938|
+-------------------+---------+--------------------+
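
To make the discrepancy concrete, here is a minimal plain-Python check of the printed numbers. It assumes that final_liklihood is simply lk_sum / lk_trials. That relationship is inferred from the output and is not stated anywhere in my code above, and the names `runs` and `recompute_final` are only illustrative.

```python
import math

# Values copied verbatim from the three result tables printed above.
runs = {
    "logistic_regression": {
        "lk_sum": -1181.226211424361,
        "lk_trials": 21855.0,
        "final_liklihood": -0.05404832813655278,
    },
    "GBTRegressor": {
        "lk_sum": -794.1915302136496,
        "lk_trials": 21376.0,
        "final_liklihood": -0.03715342113649184,
    },
    "random_forest": {
        "lk_sum": -1494.7763490404889,
        "lk_trials": 22509.0,
        "final_liklihood": -0.06640794122530938,
    },
}


def recompute_final(lk_sum, lk_trials):
    # Assumption: the final value is the summed log-likelihood per trial.
    return lk_sum / lk_trials


# Check whether the assumed formula reproduces each printed final value.
formula_matches = {
    name: math.isclose(
        recompute_final(row["lk_sum"], row["lk_trials"]),
        row["final_liklihood"],
        rel_tol=1e-6,
    )
    for name, row in runs.items()
}

# Collect the distinct trial totals; identical input would give exactly one.
distinct_trial_totals = sorted({row["lk_trials"] for row in runs.values()})

print(formula_matches)
print(distinct_trial_totals)
```

The three runs report three different lk_trials totals (21855, 21376, 22509). So either the set of rows being summed or the trials values themselves differed between runs. This narrows down where to look but does not identify the cause.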

0 answers:

No answers