Trying to figure out why I'm getting an encoder error; any insight would be helpful!
Error: `Unable to find encoder for type SolrNewsDocument, an implicit Encoder[SolrNewsDocument] is needed to store ...`
I have clearly imported spark.implicits._, and I have also provided the encoder as a case class.
def ingestDocsToSolr(newsItemDF: DataFrame) = {
  case class SolrNewsDocument(
    title: String,
    body: String,
    publication: String,
    date: String,
    byline: String,
    length: String
  )
  import spark.implicits._
  val solrDocs = newsItemDF.as[SolrNewsDocument].map { doc =>
    val solrDoc = new SolrInputDocument
    solrDoc.setField("title", doc.title.toString)
    solrDoc.setField("body", doc.body)
    solrDoc.setField("publication", doc.publication)
    solrDoc.setField("date", doc.date)
    solrDoc.setField("byline", doc.byline)
    solrDoc.setField("length", doc.length)
    solrDoc
  }
  // can be used for stream SolrSupport.
  SolrSupport.indexDocs("localhost:2181", "collection", 10, solrDocs.rdd)
  val solrServer = SolrSupport.getCachedCloudClient("localhost:2181")
  solrServer.setDefaultCollection("collection")
  solrServer.commit(false, false)
}
Answer 0 (score: 0)
// Check this: move the case class declaration before the function declaration.
// The Encoder is created once the case class has been compiled; only then can
// the compiler use that encoder inside the function declaration.
import spark.implicits._
case class SolrNewsDocument(title: String, body: String, publication: String, date: String, byline: String, length: String)

def ingestDocsToSolr(newsItemDF: DataFrame) = {
  val solrDocs = newsItemDF.as[SolrNewsDocument]
}
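For completeness, here is a sketch of how the original function could look once the case class and the import live at the top level. This is only an illustration: the SolrSupport calls, the ZooKeeper address, and the collection name are copied from the question (assumed to be the spark-solr helpers), the `spark` session is assumed to already exist, and the mapping to SolrInputDocument is done on the RDD so that no second encoder is needed.

import com.lucidworks.spark.util.SolrSupport // assumed spark-solr package layout
import org.apache.solr.common.SolrInputDocument
import org.apache.spark.sql.{DataFrame, SparkSession}

// Top-level case class: the compiler can derive Encoder[SolrNewsDocument] here.
case class SolrNewsDocument(
  title: String,
  body: String,
  publication: String,
  date: String,
  byline: String,
  length: String
)

val spark: SparkSession = SparkSession.builder().getOrCreate()
import spark.implicits._

def ingestDocsToSolr(newsItemDF: DataFrame): Unit = {
  // Dropping to the RDD before building SolrInputDocuments avoids needing an
  // Encoder[SolrInputDocument], which spark.implicits._ does not provide.
  val solrDocs = newsItemDF.as[SolrNewsDocument].rdd.map { doc =>
    val solrDoc = new SolrInputDocument
    solrDoc.setField("title", doc.title)
    solrDoc.setField("body", doc.body)
    solrDoc.setField("publication", doc.publication)
    solrDoc.setField("date", doc.date)
    solrDoc.setField("byline", doc.byline)
    solrDoc.setField("length", doc.length)
    solrDoc
  }

  // Same indexing calls as in the question ("localhost:2181" / "collection"
  // are the question's placeholders).
  SolrSupport.indexDocs("localhost:2181", "collection", 10, solrDocs)
  val solrServer = SolrSupport.getCachedCloudClient("localhost:2181")
  solrServer.setDefaultCollection("collection")
  solrServer.commit(false, false)
}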
Answer 1 (score: 0)
I ran into this error while trying to iterate over a text file. In my case, as of Spark 2.4.x, the fix was to convert it to an RDD first (previously this was implicit):
textFile
  .rdd
  .flatMap(line => line.split(" "))
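As a minimal, self-contained sketch of that workaround (the file path and variable names here are placeholders, not from the original post):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()

// Dataset[String] read from a plain text file (path is a placeholder).
val textFile = spark.read.textFile("input.txt")

// Converting to an RDD first sidesteps the implicit Encoder that
// Dataset.flatMap would otherwise require for its result type.
val words = textFile
  .rdd
  .flatMap(line => line.split(" "))

words.take(10).foreach(println)

Alternatively, keeping the flatMap on the Dataset also works if spark.implicits._ is imported, since that supplies the Encoder[String] for the result.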