I am trying this example from https://jaceklaskowski.gitbooks.io/mastering-apache-spark/spark-sql-joins.html#joinWith:
case class Person(id: Long, name: String, cityId: Long)
case class City(id: Long, name: String)
val people = Seq(Person(0, "Agata", 0), Person(1, "Iweta", 0)).toDS
val cities = Seq(City(0, "Warsaw"), City(1, "Washington")).toDS
val joined = people.joinWith(cities, people("cityId") === cities("id"))
joined.show()
and I get this error:
Caused by: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 21, Column 35: Incompatible expression types "boolean" and "java.lang.Boolean"
Please help. Thanks in advance.
Answer 0 (score: 0)
I tried this with Spark version 1.6.0, Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_67) and got
scala> val joined = people.joinWith(cities, people("cityId") === cities("id"))
<console>:33: error: org.apache.spark.sql.Dataset[Person] does not take parameters
val joined = people.joinWith(cities, people("cityId") === cities("id"))
^
Instead, you can try
people.as("p").joinWith(cities.as("c"), $"p.cityId" === $"c.id").show
or
people.joinWith(cities, people.toDF()("cityId") === cities.toDF()("id")).show
or
val peopleDF = Seq(Person(0, "Agata", 0), Person(1, "Iweta", 0)).toDF
val citiesDF = Seq(City(0, "Warsaw"), City(1, "Washington")).toDF
peopleDF.join(citiesDF, peopleDF("cityId") === citiesDF("id")).show
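For what it's worth, the first two options keep the typed API while the last one falls back to DataFrames: joinWith returns a Dataset of (Person, City) pairs, whereas join returns a DataFrame with the columns of both sides flattened into one row. Below is a minimal sketch of the alias-based variant, written for a spark-shell style session. It assumes Spark 2.x or later (SparkSession does not exist in 1.6), and the names spark, joined and names are only illustrative.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

case class Person(id: Long, name: String, cityId: Long)
case class City(id: Long, name: String)

// Assumption: Spark 2.x+, local mode. In spark-shell `spark` already exists
// and getOrCreate() simply returns it.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("joinWith-sketch")
  .getOrCreate()
import spark.implicits._ // brings in toDS and the $"..." column syntax

val people = Seq(Person(0, "Agata", 0), Person(1, "Iweta", 0)).toDS
val cities = Seq(City(0, "Warsaw"), City(1, "Washington")).toDS

// Alias both sides so the join condition can refer to columns by qualified name.
// The result stays typed: each element is a (Person, City) pair.
val joined: Dataset[(Person, City)] =
  people.as("p").joinWith(cities.as("c"), $"p.cityId" === $"c.id")

// Because the result is typed, ordinary Scala pattern matching works on it.
val names: Array[(String, String)] =
  joined.map { case (p, c) => (p.name, c.name) }.collect()
```

If you only need the flattened columns and not the Person and City objects, the plain DataFrame join from the last option is the simpler choice.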