I can't make sense of the output; maybe someone can help.
The cvb command:
bin/mahout cvb \
-i /work/matrix \
-o /work/cvb -k 10 -ow -x 20 \
-dict /work/sparseVectors/dictionary.file-* \
-dt /work/topics \
-mt /work/models
The vectordump command, run after cvb:
bin/mahout vectordump -i /work/topics \
-d /work/sparseVectors/dictionary.file-* \
-o /work/cvb-topic \
-dt sequencefile --vectorSize 10 \
-sort /work/topics \
-p TRUE
The resulting file cvb-topic looks like this:
0 {0:0.5380152557598438,09:0.4619846630645179,10:1.541304295897372E-8,08:1.5405183424669223E-8,04:1.4964621316424798E-8,01:1.0302427985842305E-8,00:7.566567734425231E-9,1:7.394593812516846E-9,1.180:6.745071922943742E-9,07:3.3841292288528656E-9}
What should I do next?
Answer 0 (score: 0)
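I ran into the same problem today. To get the word probabilities, run vectordump with -i /work/cvb instead of -i /work/topics. /work/cvb is the -o path of the cvb command in the question, while /work/topics is its -dt path, which as far as I can tell holds the per-document topic weights rather than the word probabilities.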
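Once the dump file contains the values you want, each line has the shape shown in the question: a key, a space, then `{term:weight,...}`. Below is a minimal Python sketch for reading such a line and ranking the entries by weight. The function name `parse_dump_line` is made up for this example, and the parsing assumes that terms never contain a comma; it is based only on the sample line above, not on any documented format guarantee.

```python
def parse_dump_line(line):
    """Split one 'key {term:weight,...}' line into (key, [(term, weight), ...]).

    Assumption: terms contain no commas. Entries are returned sorted by
    weight, highest first.
    """
    key, _, body = line.strip().partition(" ")
    body = body.strip().strip("{}")
    pairs = []
    for item in body.split(","):
        if not item:
            continue  # empty vector, e.g. "0 {}"
        # Split on the last colon so a term that itself contains ':' survives.
        term, _, weight = item.rpartition(":")
        pairs.append((term, float(weight)))
    pairs.sort(key=lambda p: p[1], reverse=True)
    return key, pairs


if __name__ == "__main__":
    # Shortened copy of the line from the question.
    sample = ("0 {0:0.5380152557598438,09:0.4619846630645179,"
              "10:1.541304295897372E-8,1.180:6.745071922943742E-9}")
    key, pairs = parse_dump_line(sample)
    for term, weight in pairs[:3]:
        print(key, term, weight)
```

In the sample line the first two weights add up to almost exactly 1 and the rest are around 1e-8, so in practice only the top few entries per line are worth looking at.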