I am trying to train a logistic regression model with Mahout. The command line and its output are shown below:
mahout trainAdaptiveLogistic --passes 100 --input /home/cloudera/Desktop/final.csv --features 20 --output /home/cloudera/Desktop/model/adaptivemodel --target Action --categories 2 --predictors Open High Close --types n n n
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /usr/lib/hadoop/bin/hadoop and HADOOP_CONF_DIR=/etc/hadoop/conf
MAHOUT-JOB: /usr/lib/mahout/mahout-examples-0.7-cdh4.7.1-job.jar
15/04/02 07:34:53 WARN driver.MahoutDriver: No trainAdaptiveLogistic.props found on classpath, will use command-line arguments only
20
Action ~ 0.000*Close + 0.000*High + 0.000*Open
Close 0.00003
High 0.00004
Open 0.00003
0.000000000 0.000033367 0.000036516 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000034630 0.000000000 0.000000000
15/04/02 07:38:30 INFO driver.MahoutDriver: Program took 216959 ms (Minutes: 3.6159833333333333)
The first few lines of the file I am using are:
Open,High,Low,Close,Volume,Adj Close,Action
59.30,60.05,58.88,59.41,3373800,59.41,BUY
59.64,60.26,58.88,59.83,3069100,59.83,BUY
58.91,59.25,58.21,59.03,3559500,59.03,SELL
59.57,60.44,58.67,58.68,3302000,58.68,BUY
Why are the coefficients so small, so close to zero? Is it wrong to use numeric predictors to predict a word?
Answer 0 (score: 1)
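This looks like a case of choosing the wrong type for the predicted variable "Action". Here it should be treated as categorical rather than as plain text. You can try assigning a binary variable to the data (0 for SELL, 1 for BUY) and then using an appropriate feature encoder.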
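As a rough illustration of the recoding step suggested above, here is a minimal Python sketch that rewrites the "Action" column as 0/1 before the file is handed to Mahout. The helper names (recode_action, recode_text, ACTION_CODES) are made up for this example, and the two sample rows are copied from the question. It only covers the 0 = SELL, 1 = BUY mapping; it does not pick a feature encoder, and whether the recoding alone changes the trained coefficients has not been verified here.

```python
import csv
import io

# Hypothetical mapping, following the answer's suggestion: 0 = SELL, 1 = BUY.
ACTION_CODES = {"SELL": "0", "BUY": "1"}


def recode_action(src, dst, target="Action"):
    """Copy CSV rows from src to dst, replacing the target label with 0/1."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        label = row[target].strip().upper()
        # Raises KeyError on any label other than BUY or SELL,
        # so unexpected values are not silently passed through.
        row[target] = ACTION_CODES[label]
        writer.writerow(row)


# Two rows taken from the sample in the question.
SAMPLE = """Open,High,Low,Close,Volume,Adj Close,Action
59.30,60.05,58.88,59.41,3373800,59.41,BUY
58.91,59.25,58.21,59.03,3559500,59.03,SELL
"""


def recode_text(text):
    """Convenience wrapper: recode CSV text held in a string."""
    out = io.StringIO()
    recode_action(io.StringIO(text), out)
    return out.getvalue()


if __name__ == "__main__":
    print(recode_text(SAMPLE), end="")
```

For a real file, recode_action can be called with two open file handles (the original CSV for reading and a new CSV for writing), and the new file can then be passed to --input in place of final.csv.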