I have a problem with parallel execution: when I run the R code below, I get an error notification.
# include library
require(stats)
library(GMD)
library(parallel)
# include function
source('~/Workspaces/Projects/RProject/MovielensCluster/readData.R'); # contains the readtext.convert() function
###
elbow.k <- function(mydata){
  ## determine a "good" k using elbow
  dist.obj <- dist(mydata);
  hclust.obj <- hclust(dist.obj);
  css.obj <- css.hclust(dist.obj,hclust.obj);
  elbow.obj <- elbow.batch(css.obj);
  # print(elbow.obj)
  k <- elbow.obj$k
  return(k)
}
# include file
filePath <- "dataset/u.user";
data.convert <- readtext.convert(filePath);
data.clustering <- data.convert[,c(-1,-4)];
# find k value
no_cores <- detectCores();
cl<-makeCluster(no_cores);
clusterExport(cl, list("data.clustering", "data.original", "elbow.k", "clustering.kmeans"));
start.time <- Sys.time();
k.clusters <- parSapply(cl, 1, function(x) elbow.k(data.clustering));
end.time <- Sys.time();
cat('Time to find k using Elbow method is',(end.time - start.time),'seconds with k value:', k.clusters);
Can anyone help me solve this? Thank you very much.
Answer 0 (score: 3)
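I think your problem is related to variable scope. On Mac/Linux you have the option of using makeCluster(no_cores, type = "FORK"), where the workers are forked copies of the master process and therefore automatically see all of its variables and loaded packages. On Windows you have to use a parallel socket cluster (PSOCK), which is also what makeCluster(no_cores) creates when no type is given. Each PSOCK worker is a fresh R session that starts with only the base packages loaded, so you always have to specify exactly which variables and which libraries your parallel function needs. clusterExport() and clusterEvalQ() are required so that the function can see the needed variables and packages, respectively. Note that any changes made to a variable after clusterExport() are ignored on the workers. Back to your problem: css.hclust() and elbow.batch() come from GMD, so that package has to be loaded on the workers. You have to use the following: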
clusterEvalQ(cl, library(GMD));
And your complete code:
# include library
require(stats)
library(GMD)
library(parallel)
# include function
source('~/Workspaces/Projects/RProject/MovielensCluster/readData.R'); # contains the readtext.convert() function
###
elbow.k <- function(mydata){
  ## determine a "good" k using elbow
  dist.obj <- dist(mydata);
  hclust.obj <- hclust(dist.obj);
  css.obj <- css.hclust(dist.obj,hclust.obj);
  elbow.obj <- elbow.batch(css.obj);
  # print(elbow.obj)
  k <- elbow.obj$k
  return(k)
}
# include file
filePath <- "dataset/u.user";
data.convert <- readtext.convert(filePath);
data.clustering <- data.convert[,c(-1,-4)];
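# find k value
no_cores <- detectCores();
cl <- makeCluster(no_cores);
# load GMD on every worker so that css.hclust() and elbow.batch() are available there
clusterEvalQ(cl, library(GMD));
# every exported name must already exist in the workspace; data.original and
# clustering.kmeans are not defined in this snippet
clusterExport(cl, c("data.clustering", "data.original", "elbow.k", "clustering.kmeans"));
start.time <- Sys.time();
k.clusters <- parSapply(cl, 1, function(x) elbow.k(data.clustering));
end.time <- Sys.time();
# release the worker processes once the result is back
stopCluster(cl);
cat('Time to find k using Elbow method is', as.numeric(difftime(end.time, start.time, units = "secs")), 'seconds with k value:', k.clusters);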
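If you want to see the scoping behaviour of a PSOCK cluster in isolation, here is a minimal sketch that does not need GMD or your data. The variable threshold and the function title.case are hypothetical names made up for the illustration, and the tools package (shipped with R, but not attached on a fresh worker) only stands in for GMD. It should show that a worker keeps the value it received at export time, and that a function from an unattached package fails until clusterEvalQ() loads the package.

```r
library(parallel)

# Two PSOCK workers: fresh R sessions that do not share the master's workspace.
cl <- makeCluster(2, type = "PSOCK")

# 1. Variables: the workers get a snapshot taken at export time.
threshold <- 10                       # hypothetical variable, illustration only
clusterExport(cl, "threshold", envir = environment())
threshold <- 99                       # changed on the master only
worker.values <- unlist(clusterEvalQ(cl, threshold))

# 2. Packages: attach them on each worker before calling their functions.
# 'tools' is used here as a stand-in for GMD.
title.case <- function(i) toTitleCase("elbow method")
before <- try(parSapply(cl, 1:2, title.case), silent = TRUE)
clusterEvalQ(cl, library(tools))
after <- parSapply(cl, 1:2, title.case)

stopCluster(cl)

print(worker.values)                  # the exported value, not 99
print(inherits(before, "try-error"))  # TRUE if the first call failed on the workers
print(after)
```

The same pattern applies to the code above: export the data and helper functions first, load GMD on the workers with clusterEvalQ(), and only then call parSapply().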