Fetching Twitter home timeline tweets with Spark Streaming and Scala

Time: 2017-04-17 12:22:14

Tags: scala apache-spark twitter spark-streaming twitter4j

How can I fetch my Twitter home timeline tweets using Spark Streaming and Scala?

val ssc = new StreamingContext(sc, Seconds(1))
val output = TwitterUtils.createStream(ssc, None)

When I use createStream, it does not return my timeline.

1 answer:

Answer 0 (score: 0)

To fetch home timeline tweets with Spark Streaming and Scala, we need to set the Twitter OAuth credentials using twitter4j's TwitterFactory.

import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.twitter.TwitterUtils
import twitter4j.TwitterFactory
import twitter4j.auth.AccessToken

// Twitter authentication credentials (placeholders)
val consumerKey = "twitter_consumer_key"
val consumerSecret = "twitter_consumer_secret"
val accessToken = "twitter_access_token"
val accessTokenSecret = "twitter_access_token_secret"

// Authorize with your Twitter application credentials
val twitter = new TwitterFactory().getInstance()
twitter.setOAuthConsumer(consumerKey, consumerSecret)
twitter.setOAuthAccessToken(new AccessToken(accessToken, accessTokenSecret))

// Set up the streaming context with a 1-second batch interval
// (sc is an existing SparkContext)
val ssc = new StreamingContext(sc, Seconds(1))
// Pass the authorization explicitly when creating the stream
val output = TwitterUtils.createStream(ssc, Option(twitter.getAuthorization()))
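The snippet above only defines the stream; nothing is received until the streaming context is started. A minimal sketch of that last step, assuming `ssc` and `output` are the values created above (the object and method names here are illustrative, not part of any library):

```scala
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.streaming.dstream.DStream
import twitter4j.Status

object RunTweetStream {
  // Registers a simple output operation, starts the context, and blocks
  // until the context is stopped.
  def run(ssc: StreamingContext, tweets: DStream[Status]): Unit = {
    // print() shows the first few elements of every batch on the driver
    tweets
      .map(status => s"@${status.getUser.getScreenName}: ${status.getText}")
      .print()

    ssc.start()            // begin receiving tweets
    ssc.awaitTermination() // keep the application alive
  }
}
```

It would be called as `RunTweetStream.run(ssc, output)`. At least one output operation such as `print()` has to be registered before `ssc.start()`, otherwise Spark Streaming refuses to start the context.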

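Alternatively, if you would rather not pass the credentials to createStream explicitly, you can set them as twitter4j system properties with the following code: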

System.setProperty("twitter4j.oauth.consumerKey", "twitter_consumer_key")
System.setProperty("twitter4j.oauth.consumerSecret", "twitter_consumer_secret")
System.setProperty("twitter4j.oauth.accessToken", "twitter_access_token")
System.setProperty("twitter4j.oauth.accessTokenSecret", "twitter_access_token_secret")


// With None, the credentials are taken from the system properties set above
val ssc = new StreamingContext(sc, Seconds(1))
val output = TwitterUtils.createStream(ssc, None)
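If the secrets should not be hardcoded in the source, the same system properties can be filled from the environment. This is only a sketch: the environment variable names (`TWITTER_CONSUMER_KEY` and so on) and the `TwitterCredentials` helper are assumptions made for illustration, not something twitter4j or Spark defines.

```scala
object TwitterCredentials {
  // twitter4j system-property names mapped to the (hypothetical)
  // environment variables assumed to hold the matching secrets.
  val propertyToEnvVar: Map[String, String] = Map(
    "twitter4j.oauth.consumerKey"       -> "TWITTER_CONSUMER_KEY",
    "twitter4j.oauth.consumerSecret"    -> "TWITTER_CONSUMER_SECRET",
    "twitter4j.oauth.accessToken"       -> "TWITTER_ACCESS_TOKEN",
    "twitter4j.oauth.accessTokenSecret" -> "TWITTER_ACCESS_TOKEN_SECRET"
  )

  // Copies every credential present in `env` into the JVM system properties
  // and returns the property names for which no value was found.
  def configure(env: Map[String, String] = sys.env): Seq[String] = {
    val (found, missing) =
      propertyToEnvVar.toSeq.partition { case (_, envVar) => env.contains(envVar) }
    found.foreach { case (prop, envVar) => System.setProperty(prop, env(envVar)) }
    missing.map { case (prop, _) => prop }
  }
}
```

Calling `TwitterCredentials.configure()` before `TwitterUtils.createStream(ssc, None)` and checking that the returned list is empty gives an early, readable failure when a variable is missing.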

For a complete Spark Streaming and Scala example, see: knolx-spark-streaming
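One caveat worth checking: as far as I can tell, `TwitterUtils.createStream` called without filters subscribes to Twitter's public sample stream, so the tweets it delivers are not necessarily those of the authenticated user's home timeline. If the home timeline itself is what is needed, one option is to request it through twitter4j's REST call `getHomeTimeline`. The sketch below assumes an already authorized `Twitter` instance such as the `twitter` value built in the first snippet; the `HomeTimeline` object is an illustrative name.

```scala
import scala.collection.JavaConverters._
import twitter4j.{Status, Twitter}

object HomeTimeline {
  // Fetches the most recent page of the authenticated user's home timeline.
  // This is a one-off REST request, not a continuous stream.
  def fetch(twitter: Twitter): Seq[Status] =
    twitter.getHomeTimeline().asScala.toList

  // Prints each fetched tweet as "@screenName: text".
  def printLatest(twitter: Twitter): Unit =
    fetch(twitter).foreach { status =>
      println(s"@${status.getUser.getScreenName}: ${status.getText}")
    }
}
```

The result of `HomeTimeline.fetch(twitter)` can be handed to Spark with `sc.parallelize` for batch processing. Polling it repeatedly is subject to Twitter's REST rate limits, so the interval between calls should be chosen conservatively.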