Mahout CVB topic file

Time: 2014-03-06 14:20:30

Tags: java mahout

I can't make sense of the output below; maybe someone can help.

The cvb command:

bin/mahout cvb \
-i /work/matrix \
-o /work/cvb -k 10 -ow -x 20 \
-dict /work/sparseVectors/dictionary.file-* \
-dt /work/topics \
-mt /work/models

The vector dump, run after cvb:

bin/mahout vectordump -i /work/topics \
-d /work/sparseVectors/dictionary.file-* \
-o /work/cvb-topic \
-dt sequencefile --vectorSize 10 \
-sort /work/topics \
-p TRUE

The cvb-topic file looks like this:

0   {0:0.5380152557598438,09:0.4619846630645179,10:1.541304295897372E-8,08:1.5405183424669223E-8,04:1.4964621316424798E-8,01:1.0302427985842305E-8,00:7.566567734425231E-9,1:7.394593812516846E-9,1.180:6.745071922943742E-9,07:3.3841292288528656E-9} 

What should I do next?

1 answer:

Answer 0 (score: 0)

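I ran into the same problem today. To get the word probabilities, pass -i /work/cvb to vectordump instead of -i /work/topics. As far as I can tell, /work/cvb (the -o path of the cvb command) holds the topic-term distributions, while /work/topics (the -dt path) holds the per-document topic distributions, so mapping its indices through the dictionary gives meaningless labels.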
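
One way to check what a dumped line contains is to parse it and add up the weights. The sketch below is only an illustration, not part of Mahout: parse_dump_line is a hypothetical helper, and the input is the sample line copied from the question. If a line has as many entries as the -k value and the weights sum to roughly 1, that is consistent with a distribution over topics for one document rather than a list of word probabilities.

```python
# Sample line copied verbatim from the cvb-topic file in the question.
line = (
    "0   {0:0.5380152557598438,09:0.4619846630645179,"
    "10:1.541304295897372E-8,08:1.5405183424669223E-8,"
    "04:1.4964621316424798E-8,01:1.0302427985842305E-8,"
    "00:7.566567734425231E-9,1:7.394593812516846E-9,"
    "1.180:6.745071922943742E-9,07:3.3841292288528656E-9} "
)


def parse_dump_line(text):
    """Hypothetical helper: split one dumped line into (key, [(label, weight), ...])."""
    key, _, body = text.strip().partition("{")
    pairs = []
    for item in body.rstrip("}").split(","):
        # Split on the last colon, since a label such as "1.180" may contain dots.
        label, _, weight = item.rpartition(":")
        pairs.append((label, float(weight)))
    return key.strip(), pairs


key, pairs = parse_dump_line(line)
total = sum(weight for _, weight in pairs)

# Print the vector key, the number of entries, and the sum of the weights.
print(key, len(pairs), total)
```

For the sample line this gives ten entries, matching -k 10, and the two largest weights alone already add up to almost 1, so the line most likely describes how document 0 is split across topics.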