Error using Zeppelin's z.put function with Spark and Kafka

Date: 2016-08-12 20:10:11

Tags: angularjs apache-spark apache-kafka pyspark spark-streaming

I am trying to use Spark and Kafka in a Zeppelin notebook with the following code.

    %pyspark

    import pyspark
    from pyspark.streaming.kafka import KafkaUtils
    from pyspark.streaming import StreamingContext

    ssc = StreamingContext(sc, 60)
    broker = "127.0.0.1:9092"
    directKafkaStream = KafkaUtils.createDirectStream(ssc, ["testlogs"], {"metadata.broker.list": broker})
    lines = directKafkaStream.map(lambda x: x[1])

    f1 = lines.filter(lambda x: x[2])
    f2 = f1.map(lambda x: (x.split(",")[4]))
    f3 = f2.map(lambda x: (x.split(":")[1]))
    f4 = f3.map(lambda x: x)
    f5 = f4.filter(lambda x: x == "test string")
    f6 = f5.count()
    f6.pprint()
    # The above prints the correct count on the console

    z.put('m0_count', str(f6))

    ssc.start()
    ssc.awaitTerminationOrTimeout(300)
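For reference, a minimal sketch of how the concrete number could be stored instead of the DStream object: since `f6` is a DStream, its per-batch count only materializes inside `foreachRDD`. The `FakeZ` and `FakeRDD` classes below are hypothetical stand-ins for Zeppelin's `z` context and the single-element RDD that `DStream.count()` produces per batch, so the pattern can run outside a notebook.

```python
class FakeZ:
    """Hypothetical stand-in for Zeppelin's z context, for illustration only."""
    def __init__(self):
        self.store = {}

    def put(self, key, value):
        self.store[key] = value


class FakeRDD:
    """Stand-in for the one-element RDD that DStream.count() yields each batch."""
    def __init__(self, values):
        self._values = values

    def collect(self):
        return self._values


z = FakeZ()

def store_count(time, rdd):
    # Pull the batch's count out of the RDD; this runs once per micro-batch.
    counts = rdd.collect()          # e.g. [3] for a batch with three matches
    if counts:
        z.put('m0_count', str(counts[0]))

# In the real notebook this callback would be registered with:
#     f6.foreachRDD(store_count)
store_count("2016-08-12 20:10:00", FakeRDD([3]))
print(z.store['m0_count'])  # -> '3'
```

This stores a fresh string per batch, so the bound Angular variable would need to be re-read (or re-bound) after each batch to reflect updates.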


    %spark

    z.angularBind("m0_count", z.get("m0_count"))


    %angular

    <html>
    <h2>Table</h2>
        <hr />
        <div class="row">
            <div class="col-md-6"><center><h3>my count</h3>{{m0_count}}</center></div>
        </div>
        <br />
    </html>

I get the following error in the AngularJS table:

    "pyspark.streaming.dstream.TransformedDStream object at 0x7fdc797c3b90"
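That string looks like Python's default object representation: `f6` is a `TransformedDStream`, and `str(f6)` is evaluated once at graph-construction time, before any batch has run, so `z.put` stores the DStream's repr rather than a count. A minimal sketch of that behavior, using a bare class as a hypothetical stand-in for pyspark's `TransformedDStream`:

```python
# Stand-in for pyspark.streaming.dstream.TransformedDStream (illustration only).
# str() on a plain object falls back to the default repr, which is exactly the
# "<... object at 0x...>" text showing up in the Angular table.
class TransformedDStream:
    pass

f6 = TransformedDStream()
s = str(f6)
print(s)  # e.g. '<__main__.TransformedDStream object at 0x7f...>'
```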

Can anyone tell me how to fix this?

0 Answers:

There are no answers yet.