Question

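Spark version: 2.1.0

I want to insert a Dataset into a Hive table that is partitioned by the 'dt' field, but the write fails.

When I use 'insertInto()', the error is: 'spark2.0 insertInto() can't be used together with partitionBy()'

When I use 'saveAsTable()', the error is: 'Saving data in the Hive serde table ad.ad_industry_user_profile_incr is not supported yet. Please use the insertInto() API as an alternative.'

The core code is as follows: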

        rowRDD.foreachRDD(new VoidFunction<JavaRDD<Row>>() {
            @Override
            public void call(JavaRDD<Row> rowJavaRDD) throws Exception {
                Dataset<Row> profileDataFrame = hc.createDataFrame(rowJavaRDD, schema).coalesce(1);
                profileDataFrame.write().partitionBy("dt").mode(SaveMode.Append).insertInto(tableName);
//                profileDataFrame.write().partitionBy("dt").mode(SaveMode.Append).saveAsTable(tableName);
            }
        });

Please help me.

Answer 1

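Use profileDataFrame.write().mode(SaveMode.Append).insertInto(tableName) without .partitionBy("dt"). The Hive table already defines its partition columns, so insertInto() takes the partitioning from the table and rejects an additional partitionBy().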
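One thing that may still matter once partitionBy("dt") is dropped: insertInto() matches columns by position rather than by name, and for a partitioned Hive table the partition columns are expected to come last. The sketch below is a small, Spark-free helper that computes such a column order. The class name, the method name, and the columns user_id and industry are hypothetical and used only for illustration; only dt comes from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PartitionColumnOrder {

    /**
     * Returns the column names with the partition columns moved to the end.
     * The relative order of the remaining columns is preserved.
     */
    public static List<String> partitionColumnsLast(List<String> columns,
                                                    List<String> partitionColumns) {
        List<String> ordered = new ArrayList<>();
        // Keep every non-partition column in its original order.
        for (String column : columns) {
            if (!partitionColumns.contains(column)) {
                ordered.add(column);
            }
        }
        // Append the partition columns, failing fast if one is missing.
        for (String partitionColumn : partitionColumns) {
            if (!columns.contains(partitionColumn)) {
                throw new IllegalArgumentException(
                        "Missing partition column: " + partitionColumn);
            }
            ordered.add(partitionColumn);
        }
        return ordered;
    }

    public static void main(String[] args) {
        // Hypothetical schema: only "dt" is taken from the question.
        List<String> columns = Arrays.asList("dt", "user_id", "industry");
        // Prints [user_id, industry, dt]
        System.out.println(partitionColumnsLast(columns, Arrays.asList("dt")));
    }
}
```

Inside the foreachRDD callback, the result could then be applied before writing, for example with profileDataFrame.selectExpr(ordered.toArray(new String[0])).write().mode(SaveMode.Append).insertInto(tableName). Depending on the Hive configuration, dynamic partitioning may also need to be enabled (hive.exec.dynamic.partition=true and hive.exec.dynamic.partition.mode=nonstrict); whether that applies to this setup is an assumption, since the question does not show the session configuration.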
