Zeppelin: running multiple paragraphs

Time: 2017-07-01 14:19:59

Tags: javascript scala highcharts apache-zeppelin apache-spark-2.0

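I am trying to draw a real-time chart with Zeppelin. To do this, I followed this spark-highcharts example for a Structured Streaming DataFrame (Spark 2.1.0, Zeppelin 0.7): https://github.com/knockdata/spark-highcharts/blob/master/docs/StructureStreaming.md However, when I run the streaming paragraph (the one that defines the chart), the chart paragraph stays in the "PENDING" state.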

The code of my chart-definition paragraph is:

import com.knockdata.spark.highcharts._
import com.knockdata.spark.highcharts.model._
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession
  .builder()
  .appName("Spark structured streaming Kafka example")
  .master("yarn")
  .getOrCreate()
// Needed for .as[String] when the code runs outside Zeppelin.
import spark.implicits._

// Read the "st" topic from Kafka as a streaming DataFrame.
val inputstream = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "n11.hdp.com:6667,n12.hdp.com:6667,n13.hdp.com:6667,n10.hdp.com:6667,n9.hdp.com:6667")
  .option("subscribe", "st")
  .load()
spark.conf.set("spark.sql.streaming.checkpointLocation", "checkpoint")

// Split each CSV record, keep the four fields of interest, and total them per GSM.
val ValueString = inputstream.selectExpr("CAST(value AS STRING)").as[String]
  .select(
    expr("(split(value, ','))[1]").cast("string").as("GSM"),
    expr("(split(value, ','))[7]").cast("double").as("Duration"),
    expr("(split(value, ','))[10]").cast("double").as("DataUpLink1"),
    expr("(split(value, ','))[11]").cast("double").as("DataDownLink1")
  )
  .filter("GSM is not null and DataUpLink1 is not null and DataDownLink1 is not null and Duration is not null")
  .groupBy("GSM")
  .agg(sum("DataUpLink1") as "upload", sum("DataDownLink1") as "download", sum("Duration") as "duration")

// Define the streaming chart: one series per GSM, download plotted against duration.
val query = highcharts(
  ValueString.seriesCol("GSM")
    .series("y" -> "download", "x" -> "duration")
    .orderBy(col("GSM")), z, "complete")
query.processAllAvailable()
// Blocks until the streaming query terminates.
query.awaitTermination()
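
To make the select/filter/groupBy step above easier to follow, here is a minimal plain-Scala sketch (no Spark, no Kafka) of roughly what it computes: split each comma-separated record, take fields 1, 7, 10 and 11, drop records where a field is missing or not numeric, and sum the three numeric fields per GSM. The object name `AggregationSketch`, the `Totals` case class and the sample records are hypothetical and only for illustration; the real layout of the Kafka messages is not shown in the question, and edge cases may differ from Spark's own casting rules.

```scala
import scala.util.Try

object AggregationSketch {
  // Hypothetical holder for the three per-GSM sums computed by the streaming query.
  final case class Totals(upload: Double, download: Double, duration: Double)

  // Parse one CSV record: index 1 = GSM, 7 = Duration, 10 = DataUpLink1, 11 = DataDownLink1.
  // Returns None when a field is missing or not numeric, similar to the "is not null" filter.
  def parse(line: String): Option[(String, Totals)] = {
    val fields = line.split(",", -1)
    if (fields.length < 12) None
    else for {
      duration <- Try(fields(7).trim.toDouble).toOption
      upload   <- Try(fields(10).trim.toDouble).toOption
      download <- Try(fields(11).trim.toDouble).toOption
    } yield (fields(1), Totals(upload, download, duration))
  }

  // Group the parsed records by GSM and add up each numeric field.
  def aggregate(lines: Seq[String]): Map[String, Totals] =
    lines.flatMap(parse).groupBy(_._1).map { case (gsm, rows) =>
      gsm -> rows.map(_._2).reduce { (a, b) =>
        Totals(a.upload + b.upload, a.download + b.download, a.duration + b.duration)
      }
    }

  def main(args: Array[String]): Unit = {
    // Made-up sample records with 12 fields each; the last one has a non-numeric duration.
    val sample = Seq(
      "r1,555001,a,b,c,d,e,10.0,f,g,100.0,200.0",
      "r2,555001,a,b,c,d,e,5.0,f,g,50.0,25.0",
      "r3,555002,a,b,c,d,e,oops,f,g,1.0,2.0"
    )
    aggregate(sample).toSeq.sortBy(_._1).foreach { case (gsm, t) =>
      println(s"$gsm upload=${t.upload} download=${t.download} duration=${t.duration}")
    }
  }
}
```

In the chart definition, each GSM then becomes one series, with `duration` on the x axis and `download` on the y axis.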

The code of my chart paragraph is:

println("%angular")
StreamingChart(z)

0 answers:

No answers