我创建了一个文档术语矩阵,如下所示:
javax.crypto.BadPaddingException: Given final block not properly padded
at main.decrypt(main.java:98)
at main.main(main.java:26)
... 9 more
Caused by: javax.crypto.BadPaddingException: Given final block not properly padded
at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:966)
at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:824)
at com.sun.crypto.provider.AESCipher.engineDoFinal(AESCipher.java:436)
at javax.crypto.Cipher.doFinal(Cipher.java:2048)
at main.decrypt(main.java:95)
在拿到它的列总和之后它给了我。
inspect(dtm[1:4,1:6])
allowed allowing almost alone companyunder companywide
Doc1.txt 1 1 1 0 1 0
Doc2.txt 0 1 1 0 1 1
Doc3.txt 0 0 0 1 0 1
Doc4.txt 1 0 1 0 1 1
这实际上表明这些单词可以在多少文档中找到(例如,允许2告诉我允许在两个文档中找到。)。
我很难创建一个频率分布图,它将x轴作为文档编号,y轴作为文档包含的字数。
答案 0 :(得分:0)
这是你要找的吗?
dtm = array(c(1,0,0,1,1,1,0,0,1,1,0,1,0,0,1,0,1,1,0,1,0,1,1,1),dim=c(4,6))
dimnames(dtm) = list(c("Doc1","Doc2","Doc3","Doc4"),c("allowed","allowing","almost","alone","companyunder","companywide"))
print(dtm)
plot(rowSums(dtm))