Apache Zeppelin 0.6.1: Running a Spark 2.0 Twitter Stream App

Date: 2016-09-12 10:03:31

Tags: scala apache-spark spark-streaming twitter4j apache-zeppelin

I have a cluster with Spark 2.0 and Zeppelin 0.6.1 installed. Since the class TwitterUtils.scala has been moved from the Spark project to Apache Bahir, I can no longer use TwitterUtils in my Zeppelin notebook.

Here are the snippets from my notebook:

Dependency loading:

%dep
z.reset
z.load("org.apache.bahir:spark-streaming-twitter_2.11:2.0.0")

DepInterpreter(%dep) deprecated. Remove dependencies and repositories through GUI interpreter menu instead.
DepInterpreter(%dep) deprecated. Load dependency through GUI interpreter menu instead.
res1: org.apache.zeppelin.dep.Dependency = org.apache.zeppelin.dep.Dependency@4793109a

Spark part:

import org.apache.spark.streaming.twitter
import org.apache.spark.streaming._
import org.apache.spark.storage.StorageLevel
import scala.io.Source
import scala.collection.mutable.HashMap
import java.io.File
import org.apache.log4j.Logger
import org.apache.log4j.Level
import sys.process.stringSeqToProcess
import org.apache.spark.SparkConf

// ********************************* Configures the Oauth Credentials for accessing Twitter ****************************
def configureTwitterCredentials(apiKey: String, apiSecret: String, accessToken: String, accessTokenSecret: String) {...}

// ***************************************** Configure Twitter credentials ********************************************
val apiKey = ...
val apiSecret = ...
val accessToken = ...
val accessTokenSecret = ...
configureTwitterCredentials(apiKey, apiSecret, accessToken, accessTokenSecret)

//  ************************************************* The logic itself *************************************************
val ssc = new StreamingContext(sc, Seconds(2))
val tweets = TwitterUtils.createStream(ssc, None)
val twt = tweets.window(Seconds(60))

When I try to run the Spark part of the notebook after loading the dependency, I get the following error:

<console>:44: error: object twitter is not a member of package org.apache.spark.streaming
   import org.apache.spark.streaming.twitter

What am I doing wrong here? The Bahir documentation also uses the import org.apache.spark.streaming.twitter._ statement; see http://bahir.apache.org/docs/spark/2.0.0/spark-streaming-twitter/

1 answer:

Answer 0 (score: 1)

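Well, %dep is not entirely stable, and since it has been deprecated, why not use a supported method instead? If you don't want to modify the Spark and Zeppelin configuration files, you can add the dependency in the interpreter configuration (I have omitted the properties for clarity):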

[Image: screenshot of the interpreter configuration; not preserved]
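
For reference, here is a minimal sketch of what the Spark paragraph might look like once the dependency is in place. It assumes the coordinates from the question (org.apache.bahir:spark-streaming-twitter_2.11:2.0.0) were added as an artifact in the Spark interpreter settings and the interpreter was restarted, that sc is the SparkContext Zeppelin provides, and that the Twitter OAuth credentials were already configured as in the question. Note the wildcard import: with a plain import of the twitter package, TwitterUtils would have to be written as twitter.TwitterUtils.

```scala
// Assumption: runs inside a Zeppelin Spark paragraph, where `sc` is predefined,
// and the Bahir Twitter artifact was added through the interpreter settings.
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.twitter._ // wildcard brings TwitterUtils into scope

// 2-second batches, as in the question
val ssc = new StreamingContext(sc, Seconds(2))

// None means twitter4j reads the OAuth credentials from its own configuration
// (for example the twitter4j.oauth.* system properties)
val tweets = TwitterUtils.createStream(ssc, None)

// Sliding 60-second window over the incoming tweets
val twt = tweets.window(Seconds(60))
```

If the import in this paragraph still fails with "object twitter is not a member of package org.apache.spark.streaming", the artifact is most likely not on the interpreter's classpath yet. Nothing is actually consumed from Twitter until an output operation is registered on the stream and ssc.start() is called.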