Empty RDD batches in Spark Streaming

Date: 2019-07-14 22:33:59

Tags: json apache-spark rabbitmq streaming

I have implemented a Spark Streaming job in Scala that receives JSON string records from a remote RabbitMQ queue. Spark always returns empty RDDs.

  • I created a durable queue in RabbitMQ.

  • A Java app sends a JSON file by iterating over the file's lines and publishing each line as a string via `channel.basicPublish()` (a Scala sketch of an equivalent publisher follows the streaming code below).

  • In Scala, the streaming side looks like this:

       import org.apache.spark.SparkConf
       import org.apache.spark.sql.SparkSession
       import org.apache.spark.streaming.{Seconds, StreamingContext}
       import com.stratio.receiver.RabbitMQUtils  // package depends on the spark-rabbitmq version

       val sparkConfig = new SparkConf()
         .setAppName("AppName")
         .setIfMissing("spark.master", "local[*]")
         .set("spark.driver.allowMultipleContexts", "true")
         .set("spark.dynamicAllocation.enabled", "false")

       // One StreamingContext per JVM; it also owns the underlying SparkContext
       val ssc = new StreamingContext(sparkConfig, Seconds(3))

       val receiverStream = RabbitMQUtils.createStream(ssc, Map(
         "hosts" -> host,
         "queueName" -> queueName,
         "userName" -> userName,
         "password" -> password
       ))

       receiverStream.foreachRDD { rdd =>
         val spark = SparkSession.builder.config(rdd.sparkContext.getConf).getOrCreate()
         import spark.implicits._  // for later Dataset/DataFrame conversions

         if (!rdd.isEmpty()) {
           val count = rdd.count()
           println("count is " + count)
         }
       }

       // Receivers are started by the context, not by calling start() on the stream,
       // and all output operations must be registered before this point
       ssc.start()
       ssc.awaitTermination()

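For reference, here is the publishing side described in the second bullet, rewritten as a minimal Scala sketch of that Java app. It reuses the `host` and `queueName` values from the snippet above; the file name `records.json` and the queue settings (durable, non-exclusive, non-auto-delete) are assumptions chosen to match the durable queue mentioned earlier.

       import com.rabbitmq.client.{ConnectionFactory, MessageProperties}
       import scala.io.Source

       val factory = new ConnectionFactory()
       factory.setHost(host)  // same broker as the streaming job
       val connection = factory.newConnection()
       val channel = connection.createChannel()

       // Declare the queue as durable, matching the consumer side:
       // (queue, durable = true, exclusive = false, autoDelete = false, arguments)
       channel.queueDeclare(queueName, true, false, false, null)

       // Publish each line of the JSON file as one persistent message
       for (line <- Source.fromFile("records.json").getLines()) {
         channel.basicPublish("", queueName,
           MessageProperties.PERSISTENT_TEXT_PLAIN, line.getBytes("UTF-8"))
       }

       channel.close()
       connection.close()

Publishing to the default exchange (`""`) with the queue name as the routing key delivers straight to that queue; if the real Java app publishes to a named exchange instead, the queue must be bound to that exchange, otherwise the messages are silently dropped and the stream stays empty.
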

The code connects to RabbitMQ successfully, but every batch arrives as an empty RDD. What could cause this?
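One way to narrow this down, bypassing Spark entirely, is to pull a single message straight off the queue with the plain RabbitMQ Java client. This is a minimal sketch, again assuming the same `host` and `queueName` values as above; if it reports an empty queue, the problem is on the publishing side rather than in the streaming job.

       import com.rabbitmq.client.ConnectionFactory

       val factory = new ConnectionFactory()
       factory.setHost(host)  // same connection settings as the streaming job
       val conn = factory.newConnection()
       val ch = conn.createChannel()

       // basicGet returns null when the queue has no messages
       val response = ch.basicGet(queueName, false)  // autoAck = false, so the message is requeued
       if (response == null)
         println("Queue is empty - nothing for the receiver to consume")
       else
         println("Got: " + new String(response.getBody, "UTF-8"))

       ch.close()
       conn.close()
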

0 Answers