Duplicate values still end up in Redis despite an if ... else filter in Spark Streaming

Time: 2019-01-28 09:06:10

Tags: apache-spark redis spark-streaming

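When I use Redis inside a Spark Streaming job, the if ... else block inside foreachRDD does not seem to work correctly, and I don't know why. Is it related to how Spark RDDs work? Any help would be appreciated!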

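Below is part of the Spark code. To compute a train's speed continuously, I use Redis to keep the coordinates and timestamps of the two most recent batches. Because the time-ordered data source can deliver several records with the same timestamp, I added an if ... else block to avoid saving the same timestamp twice. Even so, the results in Redis still have identical Time1 and Time2 values, and the other fields are duplicated as well (for example, Altitude1 equals Altitude2).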
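The intended two-slot update can be sketched without Spark or Redis. This is only a minimal model of the behaviour described above: `Sample`, `TwoSlots` and `TwoSlotState.update` are hypothetical names that do not appear in the original job, and an `Option` stands in for "the Redis key does not exist yet".

```scala
// Hypothetical in-memory model of the Redis hash kept per train code:
// slot 1 holds the previous sample, slot 2 holds the latest sample.
final case class Sample(time: String, latitude: String, longitude: String, altitude: String)
final case class TwoSlots(previous: Sample, latest: Sample)

object TwoSlotState {
  // Returns the state after seeing `incoming`.
  def update(state: Option[TwoSlots], incoming: Sample): TwoSlots = state match {
    // No key yet: the first sample fills both slots, as in the posted code.
    case None => TwoSlots(incoming, incoming)
    // Same timestamp as the latest sample: leave the state untouched.
    case Some(s) if s.latest.time == incoming.time => s
    // New timestamp: shift slot 2 into slot 1, then store the new sample.
    case Some(s) => TwoSlots(s.latest, incoming)
  }
}
```

Under this model the two slots are identical only between the first sample and the arrival of a second, distinct timestamp. If the real job shows identical slots after that point, the updates are probably not being applied one record at a time in this order.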

import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

val message = KafkaUtils.createDirectStream[String, String](
  ssc,
  LocationStrategies.PreferConsistent,
  ConsumerStrategies.Subscribe[String, String](Constants.inputKafkaTopics, GetKafka.getKafkaConfig())
)

message.foreachRDD { rdd =>
  if (!rdd.isEmpty()) {
    // Parse each Kafka record into an AlarmSignal; a record that fails to parse yields null.
    val signal = rdd.map { r =>
      var alarmSignal: AlarmSignal = null
      try {
        val signal: JSONObject = JSON.parseObject(r.value())
        alarmSignal = AlarmSignal.parseFromJson(signal)
      } catch {
        case e: Exception => e.printStackTrace()
      }
      alarmSignal
    }.cache()

    val result = signal.map { r =>
      // Skip records that failed to parse (null) or that lack a coordinate.
      if (null != r && null != r.getLatitude && null != r.getLongitude && null != r.getAltitude) {
        val coordinate: String = "coordinateByCode:" + r.getCode
        if (!jedis.exists(coordinate)) {
          // First record for this code: fill both slots with the same values.
          jedis.hset(coordinate, "Altitude1", r.getAltitude)
          jedis.hset(coordinate, "Longitude1", r.getLongitude)
          jedis.hset(coordinate, "Latitude1", r.getLatitude)
          jedis.hset(coordinate, "Time1", r.getTime)
          jedis.hset(coordinate, "Altitude2", r.getAltitude)
          jedis.hset(coordinate, "Longitude2", r.getLongitude)
          jedis.hset(coordinate, "Latitude2", r.getLatitude)
          jedis.hset(coordinate, "Time2", r.getTime)
        } else if (jedis.exists(coordinate)) {
          // Only shift the slots when the timestamp differs from the latest stored one.
          if (!jedis.hget(coordinate, "Time2").equals(r.getTime)) {
            jedis.hset(coordinate, "Altitude1", jedis.hget(coordinate, "Altitude2"))
            jedis.hset(coordinate, "Longitude1", jedis.hget(coordinate, "Longitude2"))
            jedis.hset(coordinate, "Latitude1", jedis.hget(coordinate, "Latitude2"))
            jedis.hset(coordinate, "Time1", jedis.hget(coordinate, "Time2"))
            jedis.hset(coordinate, "Altitude2", r.getAltitude)
            jedis.hset(coordinate, "Longitude2", r.getLongitude)
            jedis.hset(coordinate, "Latitude2", r.getLatitude)
            jedis.hset(coordinate, "Time2", r.getTime)
          }
        }
      }
    }
  }
}

The same logic saves the correct values (no duplicates) in Redis when run from Java test code.
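One possible difference from a single-threaded Java test is that Spark evaluates `rdd.map` lazily and may run the partitions of a batch as concurrent tasks. The `hget` check and the `hset` writes that follow it are not one atomic Redis operation, so two tasks handling the same code could both pass the timestamp check. Whether this is what happens in this job is an assumption. As a hedged sketch, a batch could be collapsed to one reading per (code, time) and ordered by time before anything is written. `Reading` and `BatchDedup.distinctOrdered` are hypothetical names, and the sketch assumes the `time` strings sort chronologically.

```scala
// Hypothetical, simplified reading: only the fields needed for the sketch.
final case class Reading(code: String, time: String, altitude: String)

object BatchDedup {
  // Keeps one reading per (code, time) and orders each code's readings by time.
  // Assumption: `time` strings sort chronologically (for example ISO-8601).
  def distinctOrdered(batch: Seq[Reading]): Map[String, Seq[Reading]] =
    batch
      .groupBy(_.code)
      .map { case (code, readings) =>
        // Of several readings with the same timestamp, keep the first one seen.
        val onePerTime = readings.groupBy(_.time).values.map(_.head).toSeq
        code -> onePerTime.sortBy(_.time)
      }
}
```

In a Spark job, a function like this would only help if all readings for a given code reach the same task, for instance after grouping the batch by code. That way a single task applies the updates for that code in order.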

0 answers:

No answers