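I have run into a problem using the tm package together with parallel computing in R, and I am not sure whether I am doing something silly or whether it is a bug.
I have created a small reproducible example: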
# Load the libraries
library(tm)
library(snow)
# Create a Document Term Matrix
test_sentence = c("this is a test", "this is another test")
test_corpus = VCorpus(VectorSource(test_sentence))
test_TM = DocumentTermMatrix(test_corpus)
# Define a simple function that returns the matrix for the i-th document
test_function = function(i, TM){ TM[i, ] }
If I run a plain lapply on this example, I get the expected result without any problems:
# This returns the expected list containing the rows of the Matrix
res1 = lapply(1:2, test_function, test_TM)
But if I run it in parallel, I get an error:
first error: incorrect number of dimensions
# This should return the same thing of the lapply above but instead it stops with an error
cl = makeCluster(2)
res2 = parLapply(cl, 1:2, test_function, test_TM)
stopCluster(cl)
Answer 0 (score: 2)
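The problem is that the different nodes do not automatically load the tm package. Loading the package is necessary, however, because it defines the "[" method for the relevant object class.
The code below starts the cluster, loads tm on every node, exports the workspace objects to the nodes, runs the function in parallel, and stops the cluster:
# Start a socket cluster with two local nodes
cl <- makeCluster(rep("localhost",2), type="SOCK")
# Load tm on every node
clusterEvalQ(cl, library(tm))
# Export all objects in the current workspace to every node
clusterExport(cl, list=ls())
# Run the function in parallel
res <- parLapply(cl, as.list(1:2), test_function, test_TM)
# Shut the cluster down
stopCluster(cl)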
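The role of the "[" method can be illustrated in a single session, without a cluster. The sketch below is only an approximation of what a worker sees: it assumes that removing the class attribute with unclass() is a fair stand-in for a session in which tm has not been loaded, so that the matrix is handled as a plain list. The names row1 and msg are introduced here for illustration and are not part of the original question.

```r
library(tm)

# Rebuild the Document Term Matrix from the question
test_sentence <- c("this is a test", "this is another test")
test_TM <- DocumentTermMatrix(VCorpus(VectorSource(test_sentence)))

# With tm attached, "[" dispatches to the method for the matrix class
row1 <- test_TM[1, ]

# Assumption: unclass() mimics a session without the tm methods;
# the object is then an ordinary list, and two-index subsetting fails
msg <- tryCatch(
  {
    unclass(test_TM)[1, ]
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

If the assumption holds, the subsetting on the unclassed object should fail in much the same way as the parallel run in the question, which is why loading tm on each node with clusterEvalQ resolves it.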