I am trying to write Seq[(String, Double)] data, for example Seq(("re", 1.0), ("im", 2.0)), from Spark to a Cassandra DB, but I get the following exception:
Exception in thread "main" scala.ScalaReflectionException: <none> is not a term
at scala.reflect.api.Symbols$SymbolApi$class.asTerm(Symbols.scala:199)
at scala.reflect.internal.Symbols$SymbolContextApiImpl.asTerm(Symbols.scala:84)
.....
The Spark code is as follows:
def main(args: Array[String]): Unit = {
  // omit some code
  val rawTRLStream = KafkaUtils.createDirectStream[String, Array[Byte], StringDecoder, DefaultDecoder](ssc, kafkaParams, topics)
  val parsedTRLStream = rawTRLStream.map {
    case (_, inputStreamData) =>
      //--- do something next
      //....
      val seq: Seq[(String, Double)] = Seq(("re", 1.0), ("im", 2.0))
      seq
  }
  implicit val rowWriter = SqlRowWriter.Factory // This is a suggestion on the web, but it does not help on this problem.
  parsedTRLStream.saveToCassandra("simple_avro_data", "simple_avro_data")
  // Kick off
  ssc.start()
  ssc.awaitTermination()
  ssc.stop()
}
The Cassandra schema is as follows:
CREATE TABLE simple_avro_data (
re double,
im double,
PRIMARY KEY ((re), im)
) WITH CLUSTERING ORDER BY (im DESC);
I also tried the following suggestion from the thread "scala.ScalaReflectionException: <none> is not a term":
val seq = (("re", 1.0), ("im", 2.0))
This removed the "... is not a term" exception, but it introduced another one:
com.datastax.spark.connector.types.TypeConversionException: Cannot convert object (re,1.0) of type class scala.Tuple2 to java.lang.Double.
Does anyone know how to solve this problem?
Thanks,
Answer 0 (score: 0)
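Make sure you set a default value whenever an expected value is missing or empty.
For example, if we use SomeColumns and look first at code that works, and then at code that always throws the exception when the input data is malformed, the problem becomes easier to see.
The following code works safely because it always builds a request with 3 columns, either from the data or from error placeholders.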
val lines = ssc.socketTextStream(host, port)
// create requests from socket stream
val requests = lines.map(x => {
  val matcher: Matcher = pattern.matcher(x) // using a regex matcher on the values
  if (matcher.matches()) { // have matches
    val ip = matcher.group(1)
    val request = matcher.group(5)
    val status = matcher.group(6).toInt
    (ip, request, status) // create the request
  } else {
    ("error", "error", 0) // no matches, so create an error request
  }
})
requests.foreachRDD((rdd, time) => {
  rdd.cache()
  rdd.saveToCassandra(keyspace, table, SomeColumns("ip", "request", "status")) // save what we put into the request
})
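Then I added a column to the request, to the database table, and to the saveToCassandra call, but not to the else branch.
Whenever the data in my stream is not what I expect and the else request is created, I get the exception "scala.ScalaReflectionException: <none> is not a term":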
val lines = ssc.socketTextStream(host, port)
// create requests from socket stream
val requests = lines.map(x => {
  val matcher: Matcher = pattern.matcher(x) // using a regex matcher on the values
  if (matcher.matches()) { // have matches
    val ip = matcher.group(1)
    val request = matcher.group(5)
    val status = matcher.group(6).toInt
    (ip, request, status, agent) // create the request + ADDITIONAL agent column
  } else {
    ("error", "error", 0) // Will get an exception if this is created without 4 columns
  }
})
requests.foreachRDD((rdd, time) => {
  rdd.cache()
  rdd.saveToCassandra(keyspace, table, SomeColumns("ip", "request", "status", "agent")) // save what we put into the request
})
I need to add a default value for the new agent column in the else branch, so that 4 columns of data are always sent:
else {
  ("error", "error", 0, "error")
}
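One way to keep both branches in step is to move the parsing into a function with an explicit tuple return type, so that a branch returning only 3 values fails at compile time instead of at runtime. The sketch below is only an illustration under assumptions: the object name RequestParser, the errorRow value, and the simplified regex (four space-separated fields) are hypothetical, since the real pattern and the group number for agent are not shown above.

```scala
import java.util.regex.{Matcher, Pattern}

object RequestParser {
  // Hypothetical, simplified pattern: "<ip> <request> <status> <agent>".
  // The pattern used in the answer is not shown, so the group numbers differ here.
  private val pattern: Pattern = Pattern.compile("""(\S+) (\S+) (\d{3}) (\S+)""")

  // Fallback row with the same arity and element types as a parsed row.
  val errorRow: (String, String, Int, String) = ("error", "error", 0, "error")

  // The explicit return type makes the compiler reject a branch that returns a 3-tuple.
  def parse(line: String): (String, String, Int, String) = {
    val matcher: Matcher = pattern.matcher(line)
    if (matcher.matches()) {
      (matcher.group(1), matcher.group(2), matcher.group(3).toInt, matcher.group(4))
    } else {
      errorRow
    }
  }
}
```

With something like this in place, the stream could be built as lines.map(RequestParser.parse) and saved with the same four SomeColumns names. Without the type annotation, an if/else that mixes a 4-tuple and a 3-tuple still compiles, because Scala infers a common supertype for the two branches, and the mismatch only shows up once the connector inspects the row type.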
Make sure parsedTRLStream is always populated when you map the stream. The RDD will complain if you try to save something from nothing.
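Applied to the question's table, this could mean turning each Seq of name/value pairs into one flat (re, im) row before saving. As far as I know, the connector maps tuple elements to columns by position, which would explain why a tuple of tuples fails with "Cannot convert object (re,1.0) ... to java.lang.Double". The following is a minimal sketch under assumptions: the object name ReImRow and the 0.0 default are hypothetical choices, not something taken from the question.

```scala
object ReImRow {
  // Turn name/value pairs into one (re, im) row.
  // A missing name falls back to `default` (0.0 is an assumed placeholder).
  def toRow(pairs: Seq[(String, Double)], default: Double = 0.0): (Double, Double) = {
    val byName = pairs.toMap
    (byName.getOrElse("re", default), byName.getOrElse("im", default))
  }
}
```

The stream could then be mapped with parsedTRLStream.map(pairs => ReImRow.toRow(pairs)) and saved with saveToCassandra("simple_avro_data", "simple_avro_data", SomeColumns("re", "im")), so that every row carries exactly two doubles. Whether a default is acceptable for a primary key column such as re depends on the data, so treat that part as a placeholder.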
I hope this helps explain the exception better.