Why can't I load a file containing the following code in spark-shell?
import org.apache.spark.sql.types._
import org.apache.spark.sql.Encoder
import org.apache.spark.sql.Encoders
import org.apache.spark.sql.expressions.Aggregator
case class Data(i: Int)
val customSummer = new Aggregator[Data, Int, Int] {
  def zero: Int = 0
  def reduce(b: Int, a: Data): Int = b + a.i
  def merge(b1: Int, b2: Int): Int = b1 + b2
  def finish(r: Int): Int = r
}.toColumn()
Error:
<console>:47: error: object creation impossible, since:
it has 2 unimplemented members.
/** As seen from <$anon: org.apache.spark.sql.expressions.Aggregator[Data,Int,Int]>, the missing signatures are as follows.
* For convenience, these are usable as stub implementations.
*/
def bufferEncoder: org.apache.spark.sql.Encoder[Int] = ???
def outputEncoder: org.apache.spark.sql.Encoder[Int] = ???
val customSummer = new Aggregator[Data, Int, Int] {
Update: the solution from @user8371915 works, but the following script fails to load with a different error. In spark-shell I used `:load script.sc`.
import org.apache.spark.sql.expressions.Aggregator
class MyClass extends Aggregator
Error:
loading ./script.sc...
import org.apache.spark.sql.expressions.Aggregator
<console>:11: error: not found: type Aggregator
class MyClass extends Aggregator
Update (2017-12-03): it doesn't seem to work in Zeppelin either.
Answer 0 (score: 1)
According to the error message, you have not implemented bufferEncoder and outputEncoder. Check the API docs for the full list of abstract methods that must be implemented.

These two should be enough:
def bufferEncoder: Encoder[Int] = Encoders.scalaInt
def outputEncoder: Encoder[Int] = Encoders.scalaInt
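Putting the answer together with the code from the question, a minimal sketch of the complete aggregator might look like the following. It is meant to be pasted into spark-shell; the `Data` case class and the `scalaInt` encoder choices come straight from the question and answer above, while the commented-out usage lines are a hypothetical illustration that assumes the `spark` session and its implicits are in scope (as they are in spark-shell):

```scala
import org.apache.spark.sql.{Encoder, Encoders}
import org.apache.spark.sql.expressions.Aggregator

case class Data(i: Int)

// All six abstract members of Aggregator[IN, BUF, OUT] are now implemented,
// including the two encoders the compiler complained about.
val customSummer = new Aggregator[Data, Int, Int] {
  def zero: Int = 0
  def reduce(b: Int, a: Data): Int = b + a.i
  def merge(b1: Int, b2: Int): Int = b1 + b2
  def finish(r: Int): Int = r
  def bufferEncoder: Encoder[Int] = Encoders.scalaInt
  def outputEncoder: Encoder[Int] = Encoders.scalaInt
}.toColumn  // toColumn is a parameterless def, so no parentheses

// Hypothetical usage inside spark-shell:
// import spark.implicits._
// val ds = Seq(Data(1), Data(2), Data(3)).toDS()
// ds.select(customSummer).show()
```

Note that `toColumn` is declared without a parameter list, so calling it as `.toColumn()` (as in the original snippet) produces a separate compile error in Scala 2 even after the encoders are added.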