Partition iterator returns Nil in Spark Streaming

Asked: 2018-07-18 07:08:18

Tags: scala apache-spark neo4j spark-streaming

I am trying to collect the data from an RDD into batches of type List[Array[String]] using the toList method. However, when I start the application, I get the following error:

ClientException: Unable to convert scala.collection.immutable.Nil$ to Neo4j Value.

As far as I can tell, the iterator is returning Nil. What could be the reason for this behavior, and how can I fix it?
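For context on how an empty result like this can arise at all: a plain scala.collection.Iterator is single-pass, so calling toList on it a second time yields Nil. This is a minimal sketch in plain Scala (outside Spark) demonstrating that behavior; it may or may not be what is happening in the job above.

```scala
// A scala.collection.Iterator can only be traversed once.
val it = Iterator("1, 2.0, 3.0", "2, 4.0, 5.0")

val first = it.toList   // consumes the iterator
val second = it.toList  // iterator already exhausted

println(first.size)  // 2
println(second)      // List()
```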

Here is the code that processes the streaming data:

    val messagesStream = streamingContext.union(lines)

    val values = messagesStream
      .map(record => record.value().toString)
    val wordsArrays = values.map(t => t.split(", "))

    wordsArrays.foreachRDD(rdd => {
      val startTime1 = System.nanoTime()

      rdd.mapPartitions { partition => {
          val neo4jConfig = neo4jConfigurations.getNeo4jConfig(args(1))

          val res = partition.toList
          val recommendations = execBatchNeo4jSearchQuery(neo4jConfig, partition.toList)
          val calendarTime = Calendar.getInstance.getTime
          val dataMap = convertDataToMap(recommendations, calendarTime)

          dataMap.iterator
        }
      }.saveToEs("rdd-window/output")

      val endTime1 = System.nanoTime()
      System.out.println("Overall query + stream time: " + calculateElapsedTime(startTime1, endTime1))
    })
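The calculateElapsedTime helper used above is not shown in the question. A hypothetical sketch, assuming it simply converts a nanosecond interval to milliseconds:

```scala
// Hypothetical helper (not from the question): elapsed time in milliseconds
// from two System.nanoTime() readings.
def calculateElapsedTime(startNanos: Long, endNanos: Long): Long =
  (endNanos - startNanos) / 1000000L

val elapsed = calculateElapsedTime(1000000L, 5000000L)
println(elapsed + " ms")  // 4 ms
```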

Here is the definition of the execBatchNeo4jSearchQuery method:

    def execBatchNeo4jSearchQuery(neo4jSession: Session, data: List[Array[String]]) = {
      val fieldsToRetrieve = neo4jQueries.matchQueryReturnResults

      val paramsList = Map("nodes" -> {data map { seq => Map(
        "lat" -> seq(1).toDouble.asInstanceOf[AnyRef],
        "lon" -> seq(2).toDouble.asInstanceOf[AnyRef],
        "id" -> seq(0).toInt.asInstanceOf[AnyRef]
      )}}.asInstanceOf[AnyRef])

      val queryResults = neo4jSession.run(neo4jQueries.searchQueryWithBatchParams, paramsList.asJava)

      val resultsList = queryResults
        .list()
        .asScala
        .map(toRow(_, fieldsToRetrieve))
        .toList

      resultsList
    }
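The parameter construction in that method can be exercised on its own, without a Neo4j session. A minimal sketch, assuming each row has the shape "id, lat, lon" and has already been split into an Array[String] (the sample values are invented):

```scala
// Each row: Array(id, lat, lon), as produced by split(", ") upstream.
val data: List[Array[String]] = List(
  Array("1", "52.52", "13.40"),
  Array("2", "48.85", "2.35")
)

// Mirrors the "nodes" parameter list built in execBatchNeo4jSearchQuery:
// a list of maps with boxed values, suitable for passing to the driver.
val nodes = data.map { seq =>
  Map(
    "lat" -> seq(1).toDouble.asInstanceOf[AnyRef],
    "lon" -> seq(2).toDouble.asInstanceOf[AnyRef],
    "id"  -> seq(0).toInt.asInstanceOf[AnyRef]
  )
}
val paramsList = Map("nodes" -> nodes.asInstanceOf[AnyRef])

println(nodes.size)  // 2
```

Note that if the List[Array[String]] handed to this method is empty (for example because the partition iterator was already consumed), "nodes" becomes Nil, which the Neo4j driver cannot convert to a Value.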

0 Answers:

There are no answers yet.