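How to run an R script from Hive

Asked: 2015-03-09 14:01:03

Tags: r hive

I have two tables in Hive: one is test and the other is train. I wrote some R code that fetches these tables from Hive.

Here is the R code: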

# Load libraries
library(RHive)
library(rhdfs)
library(rmr2)
library(gplots)
library(gtable)
library(gtools)
library(caTools)
library(RMySQL)
library(devtools)

rhive.connect(host="xxx.xxx.x.xxx",port=xxxxx, hiveServer2=FALSE, defaultFS=NULL,
              updateJar=FALSE, user=NULL, password=NULL)


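# Pull both tables from Hive into R data frames
train <- rhive.query("select * from train")
test <- rhive.query("select * from test")

# Fit a GLM on the training data (glm defaults to family = gaussian)
trainglm <- glm(leadconverted ~ ., data = train)

# Score the test set and append the predictions as a new column
p1 <- predict.glm(trainglm, newdata = test, type = "response")
lcp <- as.matrix(p1)
colnames(lcp) <- "LeadConverted"
test1 <- cbind(test, lcp)

# Write a headerless CSV so LOAD DATA does not ingest a header row
# (write.csv always writes a header, so write.table is used instead)
write.table(test1, file = "/home/dsri/Downloads/test1", sep = ",",
            quote = FALSE, row.names = FALSE, col.names = FALSE)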
# Recreate the test table with the extra LeadConverted column, then reload it
rhive.query("drop table test")
rhive.query("CREATE TABLE test(Std_DistanctToVendor float,Std_Income float,Std_ZipPopulationDensity float,FirstLastPropCase INT,NameEmailCheck INT,SingleWeekday STRING,lead_TimeFrameCont STRING,
            Vehicle_FinanceMethod STRING,AddressProvided INT,Hybrid INT,LeadConverted float)
            ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE")
rhive.query("LOAD DATA local INPATH '/home/dsri/Downloads/test1' OVERWRITE INTO TABLE test")

I saved this R code as /home/dsri/Downloads/myscript.r

I need to run this code from Hive.

I don't understand how to get started or how to proceed.
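
One possible starting point, offered as a sketch rather than a confirmed answer: because the script opens its own connection with rhive.connect, it can be run non-interactively from a shell with Rscript. The wrapper below assumes Rscript is installed and on the PATH of the machine where the script is saved; the function name run_r_script is hypothetical and only used for illustration.

```shell
#!/bin/sh
# Hypothetical helper: run an R script non-interactively with Rscript.
# Assumes Rscript is on PATH; returns non-zero if a precondition fails.
run_r_script() {
    script="$1"

    # Refuse to continue if the script file does not exist
    if [ ! -f "$script" ]; then
        echo "script not found: $script" >&2
        return 1
    fi

    # Check that an Rscript command is available
    if ! command -v Rscript >/dev/null 2>&1; then
        echo "Rscript is not on PATH" >&2
        return 127
    fi

    # Run the script; its exit status becomes the function's status
    Rscript "$script"
}
```

With the function loaded in the current shell, the saved script would be started with `run_r_script /home/dsri/Downloads/myscript.r`. The Hive CLI can also pass a command to the operating system when the line is prefixed with `!`, so something like `!Rscript /home/dsri/Downloads/myscript.r;` typed at the hive prompt may work as well. This has not been verified against this particular setup.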

0 answers:

No answers yet