Question

I have code that runs a multi-line query:

  val hiveInsertIntoTable = spark.read.text(fileQuery).collect()
  hiveInsertIntoTable.foreach(println)

  val actualQuery = hiveInsertIntoTable(0).mkString
  println(actualQuery)


  spark.sql(s"truncate table $tableTruncate")
  spark.sql(actualQuery)

Whenever I try to execute the actual query, I get an error:

org.apache.spark.sql.catalyst.parser.ParseException:
no viable alternative at input '<EOF>'(line 1, pos 52)
== SQL ==
insert into wera_tacotv_esd.lac_asset_table_pb_hive

----------------------------------------------------^^^

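and the query ends with .... ; (it terminates in a semicolon)

The query is actually about 450 lines long.

I tried wrapping the variable in triple quotes, but that did not work either.

Any help is appreciated.

I am using Spark 2.1 and Scala 2.11.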

Answer 1

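There are three problems:

hiveInsertIntoTable is an Array[org.apache.spark.sql.Row], which is not a very useful structure here.
You take only its first row, hiveInsertIntoTable(0), so only the first line of the file reaches the parser.
Even if you took all the rows, concatenating them with an empty string (.mkString) would not work.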
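
The last two points can be seen without Spark at all. The following is a minimal sketch in plain Scala, where `lines` is a hypothetical stand-in for the collected file contents (the table and column names are invented for illustration, not taken from the question):

```scala
object MkStringDemo {
  // Hypothetical stand-in for the lines read from the query file.
  val lines: Array[String] = Array(
    "insert into some_db.some_table",
    "select id, name",
    "from some_db.source_table"
  )

  // Taking only element 0 yields an incomplete statement,
  // which is why the parser reports an unexpected <EOF>.
  val firstOnly: String = lines(0)

  // mkString with no separator glues the last token of one line
  // to the first token of the next ("some_tableselect").
  val glued: String = lines.mkString

  // Joining with a newline keeps the statement intact.
  val joined: String = lines.mkString("\n")

  def main(args: Array[String]): Unit = {
    println(firstOnly)
    println(glued)
    println(joined)
  }
}
```
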

Either:

import spark.implicits._  // provides the Encoder needed by .as[String]
val actualQuery = spark.read.text(path).as[String].collect.mkString("\n")

or

val actualQuery = spark.sparkContext.wholeTextFiles(path).values.first()
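
Putting the first option together, a small driver might look like the sketch below. The names `RunQueryFromFile` and `normalizeQuery` are hypothetical, and the file path is assumed to arrive as the first program argument. Stripping the trailing semicolon is an extra assumption that is not part of this answer: the question says the query ends in `;`, and the parser behind `spark.sql` in Spark 2.x may reject that, so check it against your version.

```scala
import org.apache.spark.sql.SparkSession

object RunQueryFromFile {
  // Joins the lines with newlines and drops one trailing semicolon, if present.
  // (Dropping the semicolon is an assumption, not something the answer prescribes.)
  def normalizeQuery(lines: Seq[String]): String =
    lines.mkString("\n").trim.stripSuffix(";").trim

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("run-query-from-file")
      .enableHiveSupport() // the question targets a Hive table
      .getOrCreate()
    import spark.implicits._ // Encoder for .as[String]

    val path = args(0) // assumed: path to the query file
    // For a single small file the collected lines keep their file order.
    val lines = spark.read.text(path).as[String].collect()

    spark.sql(normalizeQuery(lines))
    spark.stop()
  }
}
```

Keeping `normalizeQuery` as a pure function makes it possible to check the string handling without starting a Spark session.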

How do I execute a multi-line Spark SQL query when it is stored in a string variable?
