spark

Time: 2017-08-23 05:02:43

Tags: java scala apache-spark boolean

https://jaceklaskowski.gitbooks.io/mastering-apache-spark/spark-sql-joins.html#joinWith

When I try this example:
    case class Person(id: Long, name: String, cityId: Long)
    case class City(id: Long, name: String)

    val people = Seq(Person(0, "Agata", 0), Person(1, "Iweta", 0)).toDS
    val cities = Seq(City(0, "Warsaw"), City(1, "Washington")).toDS

    val joined = people.joinWith(cities, people("cityId") === cities("id"))
    joined.show()

I get this error:

Caused by: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 21, Column 35: Incompatible expression types "boolean" and "java.lang.Boolean"

Please help. Thanks in advance.

1 answer:

Answer 0 (score: 0)

I tried this with Spark 1.6.0 and Scala 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_67) and got:

    scala> val joined = people.joinWith(cities, people("cityId") === cities("id"))
    <console>:33: error: org.apache.spark.sql.Dataset[Person] does not take parameters
             val joined = people.joinWith(cities, people("cityId") === cities("id"))
                                                        ^

Instead, you can try one of the following:

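    // Option 1: alias both Datasets and refer to the columns by alias
    people.as("p").joinWith(cities.as("c"), $"p.cityId" === $"c.id").show

    // Option 2: resolve the columns through the DataFrame view of each Dataset
    people.joinWith(cities, people.toDF()("cityId") === cities.toDF()("id")).show

    // Option 3: build DataFrames and use a plain join instead of joinWith
    val peopleDF = Seq(Person(0, "Agata", 0), Person(1, "Iweta", 0)).toDF
    val citiesDF = Seq(City(0, "Warsaw"), City(1, "Washington")).toDF
    peopleDF.join(citiesDF, peopleDF("cityId") === citiesDF("id")).show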
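
For reference, the aliased `joinWith` variant can be sketched as a standalone program. This is only a sketch under assumptions that are not in the original post: it targets Spark 2.x or later (it uses `SparkSession`, which does not exist in 1.6), it runs with a local master, and the names `JoinWithSketch` and `joinPeopleWithCities` are made up for illustration. Whether it avoids the `CompileException` from the question depends on the environment, which the question does not describe.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Case classes are declared at the top level so Spark can derive encoders for them.
case class Person(id: Long, name: String, cityId: Long)
case class City(id: Long, name: String)

// Hypothetical wrapper object; not part of Spark or of the original post.
object JoinWithSketch {

  // Builds the two sample Datasets from the question and joins them with aliases.
  def joinPeopleWithCities(spark: SparkSession): Dataset[(Person, City)] = {
    import spark.implicits._ // brings toDS and the $"..." column syntax into scope

    val people = Seq(Person(0, "Agata", 0), Person(1, "Iweta", 0)).toDS
    val cities = Seq(City(0, "Warsaw"), City(1, "Washington")).toDS

    // joinWith defaults to an inner join and keeps both sides as typed objects.
    people.as("p").joinWith(cities.as("c"), $"p.cityId" === $"c.id")
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master is enough for this small example.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("joinWith-sketch")
      .getOrCreate()

    joinPeopleWithCities(spark).collect().foreach {
      case (person, city) => println(s"${person.name} -> ${city.name}")
    }

    spark.stop()
  }
}
```

Unlike `join`, which returns a flat `DataFrame`, `joinWith` returns a `Dataset[(Person, City)]`, so each result row can be pattern-matched as a pair of the original case classes.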