I'm not sure why I'm getting this error. I Googled it but couldn't find anything. My code is a bit long...
Sys.time()
# read the CSV into a Spark DataFrame over the existing connection
mydata <- spark_read_csv(spark_cluster, name = "rd_1", columns = columns,
                         path = "MAF_MAY2017_APR2018.csv", header = FALSE, delimiter = ",")
Sys.time()
KM1 <- select(GH, device_subscriber_id, location_id, reapeated_user, NO_OF_VISITS,
              cumulative_visit_duration:saturday, Week_1:Week_5)
KM1 <- distinct(KM1)
sdf_nrow(KM1)
set.seed(765675)
kmeans_model <- KM1 %>%
  ml_kmeans(centers = 2,
            formula = ~ reapeated_user + NO_OF_VISITS + cumulative_visit_duration +
              cumulative_average + NO_OF_DAYS_RECORDED + sunday + monday + tuesday +
              wednesday + thursday + friday + saturday +
              Week_1 + Week_2 + Week_3 + Week_4 + Week_5)
kmeans_model$summary
d3 <- d3 %>%
  group_by(location_id) %>%
  mutate(total_visits = n()) %>%
  group_by(location_id, first_seen_time) %>%
  mutate(total_visits_prct = n() / total_visits)
d3 <- select(d3, location_id, first_seen_time, total_visits_prct)
d3 <- distinct(d3)
first_time_seen_d3 <- collect(d3)
Error: java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext. This stopped SparkContext was created at:
Answer (score -1):
You need to initialize Spark before performing any Spark actions.
In Scala, it looks something like this:
import org.apache.spark.sql.SparkSession

// Build (or reuse) a SparkSession; this also creates the SparkContext.
val sparkSession = SparkSession.builder
  .enableHiveSupport()
  .getOrCreate()
val sc = sparkSession.sparkContext
val sqlContext = sparkSession.sqlContext
You can do something similar in Python.
Once the job is finished, you should call sparkSession.stop().
If you are running in the spark shell, you need to restart the shell or create a SparkSession as shown above.
To do the same in R, follow the wiki.
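In sparklyr terms, the key is that the connection must still be live when you call collect(). A minimal sketch of that lifecycle, assuming a local master (replace "local" with your cluster's actual master URL):

library(sparklyr)

# Connecting creates the underlying SparkContext.
sc <- spark_connect(master = "local")

# ... spark_read_csv(), ml_kmeans(), collect(), etc. all run against `sc` here ...

# Disconnect only after every collect() has finished; any Spark call made
# after this point fails with "Cannot call methods on a stopped SparkContext".
spark_disconnect(sc)

If the context was stopped out from under you (for example by a cluster timeout or a spark_disconnect_all() elsewhere in the session), the recovery is to reconnect with spark_connect() and re-create the Spark DataFrames before collecting again.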