Linear regression with RHadoop (MapReduce)

Asked: 2014-07-06 06:12:27

Tags: r regression rhadoop

I am new to RHadoop and RMR. I need to write a MapReduce job in R. I tried writing one, but it fails with an error when executed. I am trying to read the input file from HDFS.

I know how to do this in plain R: output <- lm(cnt~temp+hum,data)
I tried to implement the same regression with the code below, but it throws this error:

Error:

Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce,  : 
   hadoop streaming failed with error code 1

Code:

input = "/hdfs/bikes_LR/day.csv",
     map =
       function(., Xi) {
         yi = c[Xi[,1],]
         Xi = Xi[,-1]
         keyval(1, list(t(Xi) %*% yi))
       },
     reduce = TRUE,
     combine = TRUE)))[[1]]
solve(XtX, XtY)

Input:

instant,dteday,season,yr,mnth,holiday,weekday,workingday,weathersit,temp,atemp,hum,windspeed,casual,registered,cnt
1,2011-01-01,1,0,1,0,6,0,2,0.344167,0.363625,0.805833,0.160446,331,654,985
2,2011-01-02,1,0,1,0,0,0,2,0.363478,0.353739,0.696087,0.248539,131,670,801
3,2011-01-03,1,0,1,0,1,1,1,0.196364,0.189405,0.437273,0.248309,120,1229,1349
4,2011-01-04,1,0,1,0,2,1,1,0.2,0.212122,0.590435,0.160296,108,1454,1562
5,2011-01-05,1,0,1,0,3,1,1,0.226957,0.22927,0.436957,0.1869,82,1518,1600
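
For reference, the final solve(XtX,XtY) in the code above suggests the usual normal-equations approach: each map call computes the partial sums X'X and X'y for its block of rows, the reducer adds them up, and the coefficients come from solving (X'X) beta = X'y. The following is only a minimal sketch of that idea in base R, without rmr2 or Hadoop, using the five sample rows shown above. The names map_chunk and reduce_sums and the chunk size of two rows are hypothetical choices for illustration, not part of the original code or of any library.

```r
# The five sample rows from the question, restricted to the columns used by
# lm(cnt ~ temp + hum, data).
day <- data.frame(
  temp = c(0.344167, 0.363478, 0.196364, 0.2, 0.226957),
  hum  = c(0.805833, 0.696087, 0.437273, 0.590435, 0.436957),
  cnt  = c(985, 801, 1349, 1562, 1600)
)

# "Map" step (hypothetical helper): for one chunk of rows, build the design
# matrix with an intercept column and return the partial sums X'X and X'y.
map_chunk <- function(chunk) {
  X <- cbind(intercept = 1, temp = chunk$temp, hum = chunk$hum)
  y <- chunk$cnt
  list(XtX = crossprod(X), XtY = crossprod(X, y))
}

# "Reduce" step (hypothetical helper): partial sums from different chunks
# are combined by plain addition.
reduce_sums <- function(a, b) {
  list(XtX = a$XtX + b$XtX, XtY = a$XtY + b$XtY)
}

# Split the rows into chunks of at most two rows, as a stand-in for the
# blocks of input a distributed job would hand to each mapper.
chunks <- split(day, ceiling(seq_len(nrow(day)) / 2))
totals <- Reduce(reduce_sums, lapply(chunks, map_chunk))

# Solve the normal equations (X'X) beta = X'y.
beta <- solve(totals$XtX, totals$XtY)

# Reference fit on the same rows, for comparison with beta.
fit <- lm(cnt ~ temp + hum, data = day)
```

On these rows, beta should agree with coef(fit) up to floating-point error, since both solve the same least-squares problem. Note also that in the map function above, c[Xi[,1],] indexes the built-in function c, which R rejects; Xi[,1] may have been intended, but that is an assumption.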

0 answers:

No answers yet