Question

When I try to save a DataFrame as a Hive table in PySpark:

df_writer.saveAsTable('hive_table', format='parquet', mode='overwrite')

I get the following error:

Caused by: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://hostname:8020/apps/hive/warehouse/testdb.db/hive_table
	at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:287)
	at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:229)

The path exists only up to 'hdfs://hostname:8020/apps/hive/warehouse/testdb.db/'.

Please share your thoughts.

Answer 1

Try using the DataFrameWriter like this:

import org.apache.spark.sql.SaveMode
df.write.mode(SaveMode.Append).insertInto(s"${dbName}.${t.table}")
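
Since the question uses PySpark, the Scala line above might translate roughly as follows. This is only a sketch: `qualified_name` and `append_to_hive_table` are hypothetical helper names, and it assumes the target table already exists in the metastore, because `insertInto` writes into an existing table and matches columns by position rather than by name.

```python
def qualified_name(db_name, table_name):
    """Build the `db.table` identifier passed to insertInto (hypothetical helper)."""
    return f"{db_name}.{table_name}"


def append_to_hive_table(df, db_name, table_name):
    """Append the rows of `df` to an existing Hive table (hypothetical helper).

    Assumes `df` is a pyspark.sql.DataFrame created from a SparkSession with
    Hive support enabled, and that its column order matches the table's.
    """
    # mode("append") is the PySpark counterpart of SaveMode.Append in Scala.
    df.write.mode("append").insertInto(qualified_name(db_name, table_name))
```

With the names from the question, the call would look like `append_to_hive_table(df, "testdb", "hive_table")`.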
