I have the following function, which works fine in the REPL. Basically what it does is check the data type of the schema and match it to the column later on, when the file is flattened and matched with zipWithIndex:
//Match a Schema to a Column value
def schemaMatch(x: Array[String]) = {
  var accum = 0
  for (i <- 0 until x.length) {
    val convert = x(i).toString.toUpperCase
    println(convert)
    val split = convert.split(' ')
    println(split.mkString(" "))
    matchTest(split(1), accum)
    accum += 1
  }
  def matchTest(y: String, z: Int) = y match {
    case "STRING"  => strBuf += z
    case "INTEGER" => decimalBuf += z
    case "DECIMAL" => decimalBuf += z
    case "DATE"    => dateBuf += z
  }
}
schemaMatch(schema1)
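(The strBuf, decimalBuf and dateBuf collections are declared elsewhere in my class; a minimal sketch of what they look like, with assumed declarations added here only so the snippet is self-contained, would be:)

import scala.collection.mutable.ArrayBuffer

// Mutable buffers holding the column indices found for each data type
val strBuf     = ArrayBuffer[Int]()
val decimalBuf = ArrayBuffer[Int]()
val dateBuf    = ArrayBuffer[Int]()

// schema1 is assumed to hold "<columnName> <type>" strings, for example:
val schema1 = Array("name string", "amount decimal", "created date")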
The error I'm getting:
Exception in thread "main" java.lang.NoSuchMethodError: scala.runtime.IntRef.create(I)Lscala/runtime/IntRef;
at com.capitalone.DTS.dataProfiling$.schemaMatch$1(dataProfiling.scala:112)
at com.capitalone.DTS.dataProfiling$.main(dataProfiling.scala:131)
at com.capitalone.DTS.dataProfiling.main(dataProfiling.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Line 112 is:
var accum = 0
Any ideas why it no longer works once compiled, even though it works in the REPL, and how to correct it?
Answer (score: 1)
With the code you have given us it is hard to see the underlying problem causing the NoSuchMethodError, but you could simplify the way you extract the column types and their corresponding indices:
def schemaMatch(schema: Array[String]): Map[String, List[Int]] =
  schema
    // get the 2nd word (the column type) in upper case
    .map(columnDescr => columnDescr.split(' ')(1).toUpperCase)
    // combine each column type with its index
    .zipWithIndex
    // group by column type
    .groupBy { case (colType, index) => colType }
    // keep only the indices
    .mapValues(columnIndices => columnIndices.map(_._2).toList)
It can be used like this:
val columns = Array("x string", "1 integer", "2 decimal", "2015 date")
val columnTypeMap = schemaMatch(columns)
//Map(DATE -> List(3), STRING -> List(0), DECIMAL -> List(2), INTEGER -> List(1))
val strIndices = columnTypeMap.getOrElse("STRING", Nil)
// List(0)
val decimalIndices = columnTypeMap.getOrElse("INTEGER", Nil) :::
                     columnTypeMap.getOrElse("DECIMAL", Nil)
// List(1, 2)
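As a sketch of how those index lists could then be applied when you flatten the file, here is one possibility, using a hypothetical delimited record whose fields follow the same order as the schema (the record value and delimiter below are assumptions, not part of your code):

// Hypothetical flattened record in schema order
val record = "x,1,2.5,2015-01-01".split(',')

// Pick the values of each type out of the record via the index lists
val stringValues  = strIndices.map(record(_))      // List(x)
val decimalValues = decimalIndices.map(record(_))  // List(1, 2.5)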