Question

I am trying to create a SQLContext using Scala. Below is my code.

object SqltextContextSparkScala {
  def main(args: Array[String]) {
    System.setProperty("hadoop.home.dir", "C:\\hadoop-2.6.0")
    val conf = new SparkConf().setAppName("SampleSparkScalaApp").setMaster("local[2]").set("spark.executor.memory", "1g")

    val sc = new SparkContext(conf);
    val sqlContext = new SQLContext(sc);

    val readfile = sc.textFile("C:\\tmp\\people.txt")

    import sqlContext.implicits._

    val person = readfile.map(_.split(",")).map(p=> new Person(p(0), p(1), p(2)))
      sqlContext.to

  }

}

I created the Person class as follows:

class Person(id:String,name:String,age:String){

}

How can I create a DataFrame here:

val people = readfile.map(_.split(",")).map(p=> new Person(p(0), p(1), p(2)))

Answer 1

Before:

val people=

add the statement:

import sqlContext.implicits._

Then, after:

val people

just do:

val peopleDF = people.toDF()

and you are done.
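Putting the steps above together, here is a minimal sketch, assuming Spark 1.x (the SQLContext API used in the question) and a people.txt whose lines look like id,name,age. The object name PeopleToDataFrame and the helper parsePerson are hypothetical names introduced for illustration. Person is written as a case class defined outside main, since in my understanding toDF() in that API is only offered for RDDs of case classes or tuples.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Defined at top level (outside main) so Spark can derive a schema from it.
case class Person(id: String, name: String, age: String)

object PeopleToDataFrame {
  // Hypothetical helper: turns one "id,name,age" line into a Person.
  def parsePerson(line: String): Person = {
    val p = line.split(",")
    Person(p(0), p(1), p(2))
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("SampleSparkScalaApp").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Must come after sqlContext is created and before toDF() is called.
    import sqlContext.implicits._

    val people = sc.textFile("C:\\tmp\\people.txt").map(parsePerson)
    val peopleDF = people.toDF()

    // Column names are taken from the case class fields.
    peopleDF.printSchema()
    peopleDF.show()

    sc.stop()
  }
}
```

On Windows, the hadoop.home.dir property set in the question's code may still be needed before the SparkContext is created.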

Answer 2

Found the solution. The problem was in the definition of class Person.

Previously it was defined like this:

class Person(id:String,name:String,age:String){

}

It turns out the case keyword is needed before the class declaration, as shown below:

case class Person(id:String,name:String,age:String){

}

However, I am not sure what the case keyword does here.
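As a rough illustration of what case changes, the plain-Scala sketch below (no Spark required) compares the two declarations. The names PlainPerson and CaseClassDemo and the sample values are made up for the example. The relevant difference is that a case class automatically extends Product and exposes its constructor parameters as public fields. As far as I know, that is what the Spark 1.x implicits rely on when they offer toDF(), though that link is my reading of the API rather than something stated in this thread.

```scala
// Plain class: constructor parameters are not accessible as fields,
// and the class does not extend Product.
class PlainPerson(id: String, name: String, age: String)

// Case class: parameters become public vals, and the compiler adds
// Product, Serializable, equals/hashCode/toString and a companion apply.
case class Person(id: String, name: String, age: String)

object CaseClassDemo {
  def main(args: Array[String]): Unit = {
    val p = Person("1", "Alice", "30") // no `new` needed

    println(p.name)                    // field access works
    println(p.productArity)            // number of fields
    println(p.isInstanceOf[Product])   // true for a case class

    val plain = new PlainPerson("1", "Alice", "30")
    println(plain.isInstanceOf[Product]) // false for a plain class
  }
}
```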

Creating a SQLContext from a text file in Spark using Scala
