Cannot load the data.table library on an R parallel cluster

Date: 2017-08-24 09:47:11

Tags: r parallel-processing data.table

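When I use the code below to load the data.table library on a CPU cluster, R throws an error. However, the data.table package is installed in R and works fine when used outside the parallel code.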

#include <iostream>
#include <map>
#include <string>  // std::string is used below; do not rely on transitive includes
#include <vector>

class Base
{
  private:
    std::map<int, std::string> m_base1, m_base2;
    std::vector<std::string> m_str1 = {"one", "two", "three"};
    std::vector<std::string> m_str2 = {"four", "five", "six"};

  public:
    std::vector<std::string> &str1() { return m_str1; }
    std::vector<std::string> &str2() { return m_str2; }

    std::map<int, std::string> &base1() { return m_base1; }
    std::map<int, std::string> &base2() { return m_base2; }
};

template <typename T>
void fill_vec(T *b)
{
    size_t counter = 0;
    for (const auto &str_iter : b->str1())
        (b->base1())[counter++] = str_iter;

    counter=0;
    for (const auto &str_iter : b->str2())
        (b->base2())[counter++] = str_iter;
}

int main(int argc, char *argv[])
{
    Base *b = new Base;
    fill_vec(b);

    return 0;
}

Error:

clusterEvalQ(cl, library(data.table))
Error in checkForRemoteErrors(lapply(cl, recvResult)) :
  3 nodes produced errors; first error: there is no package called 'data.table'

1 Answer:

Answer 0 (score: 0)

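Based on what HenrikB said in the comments, I solved this by repeating my .libPaths() call inside clusterEvalQ(), so that each worker searches the same library directory as the main session: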

.libPaths("C:/programs/rlib")
library(parallel)
no_cores <- detectCores() - 1

cl <- makeCluster(no_cores)

# Workers start as fresh R sessions and do not inherit the master's
# library paths, so set the path on every worker before loading packages
clusterEvalQ(cl, .libPaths("C:/programs/rlib"))

# I'm using a function from the stringdist library
clusterEvalQ(cl, library(stringdist))

# The data must also be exported to the cluster
# (unmatched and matched are objects in my global environment)
clusterExport(cl, "unmatched")
clusterExport(cl, "matched")

# Now run it; amatch() is a function in the stringdist package
result <- parLapply(cl, unmatched, function(x) amatch(x, matched, maxDist = Inf))

# Shut the workers down when finished
stopCluster(cl)
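
If you would rather not hard-code the directory, one possible variation is to read the main session's library paths and pass them to the workers, then check that the package is visible before loading it. This is only a sketch: the two-worker cluster and the variable names (n_workers, master_paths, worker_paths, has_pkg) are assumptions for illustration and do not come from the original question.

```r
library(parallel)

# Assumption: a small two-worker cluster, just for illustration
n_workers <- 2
cl <- makeCluster(n_workers)

# Library paths used by the main (master) session
master_paths <- .libPaths()

# Library paths each worker currently sees; comparing these with
# master_paths shows whether the workers are missing a directory
worker_paths <- clusterEvalQ(cl, .libPaths())

# Hand the master's library paths to every worker
invisible(clusterCall(cl, function(paths) .libPaths(paths), master_paths))

# Check on each worker whether the package can now be found
# (returns FALSE instead of raising an error if it is still missing)
has_pkg <- unlist(clusterCall(
  cl,
  function(pkg) requireNamespace(pkg, quietly = TRUE),
  "data.table"
))
print(has_pkg)

stopCluster(cl)
```

If has_pkg is TRUE for every worker, clusterEvalQ(cl, library(data.table)) should then succeed when run before stopCluster(cl). If it is still FALSE, the package is probably installed in a directory that is not in master_paths either.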