Apache Spark Dataset transformation

Time: 2019-01-22 18:13:18

Tags: java apache-spark apache-spark-sql

My goal is to join two tables. How can I do this in Java?

I get an error when using this code.

import java.util.Properties;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class App {
    public static void main(String[] args) {
        System.setProperty("hadoop.home.dir", "C:\\hadoop-common-2.2.0-bin-master");

        SparkSession sparkSession = SparkSession.builder().appName("SQL").master("local").getOrCreate();
        final Properties cp = new Properties();
        cp.put("user", "root");
        cp.put("password", "1234");

        // Load both tables over JDBC as untyped DataFrames.
        Dataset<Row> studentData = sparkSession.read().jdbc("jdbc:mysql://localhost:3306/dd", "student", cp);
        Dataset<Row> schoolData = sparkSession.read().jdbc("jdbc:mysql://localhost:3306/dd", "school", cp);

        // Map each DataFrame onto its bean class.
        Dataset<Ogrenci> studentDS = studentData.as(Encoders.bean(Ogrenci.class));
        Dataset<Okul> schoolDS = schoolData.as(Encoders.bean(Okul.class));

        // Columns cannot be compared with == in Java (the original line did this and failed);
        // build the join condition with equalTo(), then drop the redundant foreign-key column.
        Dataset<Row> resultDS = studentDS.join(schoolDS, studentDS.col("schoolId").equalTo(schoolDS.col("id")))
                .drop("schoolId");

        resultDS.show();
    }
}
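If the goal is to keep the result typed instead of falling back to Row, joinWith is an alternative. The lines below are only a sketch under assumptions: they assume the Ogrenci bean exposes a schoolId property and the Okul bean exposes an id property (so Encoders.bean publishes them as columns), and they require an import of scala.Tuple2.

// Sketch: typed join, assuming Ogrenci has a schoolId property and Okul has an id property.
// Requires: import scala.Tuple2;
Dataset<Tuple2<Ogrenci, Okul>> joined = studentDS.joinWith(
        schoolDS,
        studentDS.col("schoolId").equalTo(schoolDS.col("id")),
        "inner");

// Each record is a pair: _1 is the student bean, _2 is the matching school bean.
joined.show();

Unlike join, joinWith does not flatten the two sides into one row, so drop("schoolId") would have no effect on it; the key stays inside the first element of each pair.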

0 Answers:

No answers yet.