Why can't I load a file containing the following code in spark-shell?
import org.apache.spark.sql.types._
import org.apache.spark.sql.Encoder
import org.apache.spark.sql.Encoders
import org.apache.spark.sql.expressions.Aggregator
case class Data(i: Int)
val customSummer = new Aggregator[Data, Int, Int] {
  def zero: Int = 0
  def reduce(b: Int, a: Data): Int = b + a.i
  def merge(b1: Int, b2: Int): Int = b1 + b2
  def finish(r: Int): Int = r
}.toColumn()
Error:
<console>:47: error: object creation impossible, since:
it has 2 unimplemented members.
/** As seen from <$anon: org.apache.spark.sql.expressions.Aggregator[Data,Int,Int]>, the missing signatures are as follows.
* For convenience, these are usable as stub implementations.
*/
def bufferEncoder: org.apache.spark.sql.Encoder[Int] = ???
def outputEncoder: org.apache.spark.sql.Encoder[Int] = ???
val customSummer = new Aggregator[Data, Int, Int] {
Update: the solution from @user8371915 works, but the following script fails to load with a different error. In spark-shell I used `:load script.sc`.
import org.apache.spark.sql.expressions.Aggregator
class MyClass extends Aggregator
Error:
loading ./script.sc...
import org.apache.spark.sql.expressions.Aggregator
<console>:11: error: not found: type Aggregator
class MyClass extends Aggregator
Update (2017-12-03): it doesn't seem to work in Zeppelin either.
Answer 0 (score: 1)
According to the error message, you have not implemented bufferEncoder and outputEncoder. Check the API docs for the full list of abstract methods that must be implemented.

These two should be enough:
def bufferEncoder: Encoder[Int] = Encoders.scalaInt
def outputEncoder: Encoder[Int] = Encoders.scalaInt
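Putting the answer together with the code from the question, a minimal sketch of the complete aggregator might look like the following. It is meant to be pasted into spark-shell; the `Data` case class and the `scalaInt` encoder choices come straight from the question and answer above, while the commented-out usage lines are a hypothetical illustration that assumes the `spark` session and its implicits are in scope (as they are in spark-shell):

```scala
import org.apache.spark.sql.{Encoder, Encoders}
import org.apache.spark.sql.expressions.Aggregator

case class Data(i: Int)

// All six abstract members of Aggregator[IN, BUF, OUT] are now implemented,
// including the two encoders the compiler complained about.
val customSummer = new Aggregator[Data, Int, Int] {
  def zero: Int = 0
  def reduce(b: Int, a: Data): Int = b + a.i
  def merge(b1: Int, b2: Int): Int = b1 + b2
  def finish(r: Int): Int = r
  def bufferEncoder: Encoder[Int] = Encoders.scalaInt
  def outputEncoder: Encoder[Int] = Encoders.scalaInt
}.toColumn  // toColumn is a parameterless def, so no parentheses

// Hypothetical usage inside spark-shell:
// import spark.implicits._
// val ds = Seq(Data(1), Data(2), Data(3)).toDS()
// ds.select(customSummer).show()
```

Note that `toColumn` is declared without a parameter list, so calling it as `.toColumn()` (as in the original snippet) produces a separate compile error in Scala 2 even after the encoders are added.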