MAXENT model for classification in R

Asked: 2014-05-08 05:01:24

Tags: r svm document-classification maxent

I am trying out R, using the RTextTools package to classify text.

I did this with SVM (the following code works fine):

matrix[[i]] <- create_matrix(trainingdata[[i]][,1], language="english",removeNumbers=FALSE, stemWords=FALSE,weighting=weightTf,minWordLength=3)
container[[i]] <- create_container(matrix[[i]],trainingdata[[i]][,2],trainSize=1:length(trainingdata[[i]][,1]),virgin=FALSE)
models[[i]] <- train_models(container[[i]], algorithms=c("SVM"))

But when I do the same thing with the MAXENT algorithm

models[[i]] <- train_models(container[[i]], algorithms=c("MAXENT"))

it throws this error:

Error in Module(module, mustStart = TRUE) : 
  function 'setCurrentScope' not provided by package 'Rcpp'  

Running traceback() gives the following details:

Module(module, mustStart = TRUE) 
.getModulePointer(x) 
maximumentropy$add_samples 
maximumentropy$add_samples 
train_maxent(feature_matrix, code_vector, l1_regularizer, l2_regularizer,  
maxent(container@training_matrix, as.vector(container@training_codes),  
train_model(container, algorithm, ...) 
train_models(container[[i]], algorithms = c("MAXENT")) 

Update

sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_Singapore.1252  LC_CTYPE=English_Singapore.1252    LC_MONETARY=English_Singapore.1252
[4] LC_NUMERIC=C                       LC_TIME=English_Singapore.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] tm_0.5-10        hash_3.0.1       RTextTools_1.4.2 SparseM_1.03    

loaded via a namespace (and not attached):
 [1] bitops_1.0-6       caTools_1.16       class_7.3-9        e1071_1.6-1        glmnet_1.9-5       grid_3.0.2        
 [7] ipred_0.9-3        KernSmooth_2.23-10 lattice_0.20-23    lava_1.2.4         MASS_7.3-29        Matrix_1.1-2      
[13] maxent_1.3.3.1     nnet_7.3-7         parallel_3.0.2     prodlim_1.4.2      randomForest_4.6-7 Rcpp_0.10.6       
[19] rpart_4.1-5        slam_0.1-31        splines_3.0.2      survival_2.37-7    tau_0.0-16         tools_3.0.2       
[25] tree_1.0-34

Is there a way to fix this?

1 Answer:

Answer 0: (score: 1)

This is not really an answer, but I am posting it here because the sessionInfo() output is lengthy:
library(RTextTools)
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)


attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] RTextTools_1.4.1   tau_0.0-15         glmnet_1.9-5       Matrix_1.0-14      lattice_0.20-23    maxent_1.3.3      
 [7] Rcpp_0.10.5        caTools_1.14       ipred_0.9-2        e1071_1.6-1        class_7.3-9        tm_0.5-9.1        
[13] nnet_7.3-7         tree_1.0-34        randomForest_4.6-7 SparseM_1.03      

loaded via a namespace (and not attached):
 [1] bitops_1.0-6       grid_3.0.2         KernSmooth_2.23-10 MASS_7.3-29        parallel_3.0.2     prodlim_1.3.7     
 [7] rpart_4.1-3        slam_0.1-30        splines_3.0.2      survival_2.37-4    tools_3.0.2 

In my case, all the required packages are listed under other attached packages, whereas in your case they are listed under loaded via a namespace (and not attached).

In the second case, R can access the packages but the user cannot. For a detailed explanation, see the question 'In R, what does "loaded via a namespace (and not attached)" mean?'
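To make the distinction concrete, here is a minimal sketch that uses only base R. The choice of `tools` as the example package is an assumption made purely for illustration (it ships with R and is not attached by default); it has nothing to do with RTextTools or the error in the question.

```r
# Illustrative only: "tools" stands in for any package that is loaded
# via a namespace but not attached.
pkg <- "tools"

# Load the namespace without attaching the package to the search path.
invisible(loadNamespace(pkg))

# The namespace is loaded, but the package is not on the search path yet.
is_loaded   <- pkg %in% loadedNamespaces()
is_attached <- paste0("package:", pkg) %in% search()

# Functions from a loaded-but-unattached package are still reachable
# through the :: operator.
ext <- tools::file_ext("report.csv")

# Attaching the package puts it on the search path.
library(pkg, character.only = TRUE)
is_attached_after <- paste0("package:", pkg) %in% search()
```

Comparing `is_attached` with `is_attached_after` shows the same change that sessionInfo() reports when a package moves from "loaded via a namespace (and not attached)" to "other attached packages".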

I don't know why these packages are not attached in your case, but as a workaround you can try this:

#grab list of package names required for RTextTools
# not_attached_list<-dput(names(sessionInfo()$otherPkgs))
#c("RTextTools", "tau", "glmnet", "Matrix", "lattice", "maxent", 
#"Rcpp", "caTools", "ipred", "e1071", "class", "tm", "nnet", "tree", 
#"randomForest", "SparseM")

not_attached_list<-c("RTextTools", "tau", "glmnet", "Matrix", "lattice", "maxent", 
"Rcpp", "caTools", "ipred", "e1071", "class", "tm", "nnet", "tree", 
"randomForest", "SparseM")

#Load the packages manually
sapply(not_attached_list, require, character.only=TRUE)

#Check in sessionInfo if they have been attached now under 'other attached packages'
sessionInfo()

Let us know whether this works.