How can I save the complete inspect() output in R's tm package instead of the 10 * 10 sample matrix?

Time: 2017-08-06 09:52:58

Tags: r text-mining tm

# My TermDocumentMatrix (TDM)
Nepal.tdm

# Structure of my TDM
str(Nepal.tdm)

# My locality vector
localities

# Structure of my locality vector
str(localities)
#chr [1:344] "kalyan" "surkhet" "chhinchu" "harre" "pyuthan" "thapdada" "khola" ...

# inspecting matching localities in my TDM
locality.matches <- inspect(Nepal.tdm[localities[localities %in% Terms(Nepal.tdm)], ])    

# I have tried the following without success: the output is always a 10 * 10 sample matrix, but I want the complete 200 * 92 matrix
as.data.frame(as.matrix(inspect(Nepal.tdm[localities[localities %in% Terms(Nepal.tdm)], ])))

capture.output(out <- data.frame(inspect(Nepal.tdm[localities[localities %in% Terms(Nepal.tdm)], ])))

1 answer:

Answer 0 (score: 0)

As of tm v.0.7, inspect.TermDocumentMatrix() displays a sample rather than the full matrix. The full dense representation is available via as.matrix().

# This gives you the entire matrix; from here you can filter it as you want
as.matrix(Nepal.tdm)
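
To connect this back to the question, here is a minimal sketch of subsetting the TDM to the matching localities first and only then converting it, so the result is the full filtered matrix rather than the printed sample. The corpus, `toy.tdm`, `toy.localities`, and the output file are hypothetical stand-ins for the asker's `Nepal.tdm` and `localities`; this is one possible approach, not necessarily the only one.

```r
library(tm)

# Hypothetical toy data standing in for the asker's corpus and locality vector
docs <- c("flood in surkhet and pyuthan",
          "landslide near surkhet",
          "relief reaches kalyan")
toy.tdm <- TermDocumentMatrix(VCorpus(VectorSource(docs)))
toy.localities <- c("kalyan", "surkhet", "pyuthan", "harre")

# Keep only the localities that actually occur as terms in the TDM
matched <- intersect(toy.localities, Terms(toy.tdm))

# Subset the sparse TDM, then convert it: as.matrix() returns every row and
# column, whereas inspect() only prints a sample
locality.matrix <- as.matrix(toy.tdm[matched, ])
locality.df <- as.data.frame(locality.matrix)

# Save the complete matrix to disk (a temporary file is used here as a placeholder)
out.file <- tempfile(fileext = ".csv")
write.csv(locality.matrix, out.file)
```

With the real objects, the same pattern would be `as.matrix(Nepal.tdm[localities[localities %in% Terms(Nepal.tdm)], ])`, that is, the asker's original subsetting expression without the inspect() call. For a very large TDM the dense matrix may use a lot of memory, so subsetting before converting is usually preferable.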