How to execute two queries in Spark

Date: 2017-10-12 10:10:20

Tags: mysql apache-spark apache-spark-sql

I'm trying to set a row number for each row in MySQL via Spark:

SET @row_num = 0;
SELECT @row_num := @row_num + 1 as row_number,t.* FROM table t 

It works fine in MySQL, but from Spark it throws an error:

val setrownum = "SET property_key[=@row_num = 0]" // "SET @row_num = 0"
val rownum = "SELECT property_key := property_key + 1 as row_number,t.* FROM table t"
val setrowexe = spark.sql(setrownum)
val rownumexe = spark.sql(rownum)

I also tried

val setrownum = "SET property_key[=@row_num = 0];SELECT property_key := property_key + 1 as row_number,t.* FROM table t"
val setrowexe = spark.sql(setrownum) 
setrowexe.show()

but no luck. How can I execute the above two queries in order to set the table row number?

 SELECT @rowid := @rowid + 1 as rowid, d.* FROM destination d, (SELECT @rowid := 0) as init

This query executes in MySQL, but not via Spark.

1 Answer:

Answer 0 (score: 0)

"This query executes in MySQL, but not via Spark"

It won't. Neither statement is supported in Spark SQL. Row numbers can be obtained with standard SQL using window functions:

SELECT row_number() OVER (...) FROM table
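For reference, a minimal Spark sketch of the window-function approach; the view name my_table and the ordering column id are placeholders, not names from the question:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("rownum-example").getOrCreate()

// Stand-in for the table loaded from MySQL (e.g. via spark.read.jdbc).
val df = spark.range(1, 6).toDF("id")
df.createOrReplaceTempView("my_table")

// row_number() needs an explicit ORDER BY inside the window definition.
val numbered = spark.sql(
  """SELECT row_number() OVER (ORDER BY id) AS row_number, t.*
    |FROM my_table t""".stripMargin)

numbered.show()

The same result is available through the DataFrame API with org.apache.spark.sql.expressions.Window and functions.row_number.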

The type of the external database is irrelevant. The only way to execute a query on the source server is to use a subquery in the DataFrame definition, but that does not support an extension like this.
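For context, "a subquery in the DataFrame definition" refers to passing a parenthesized query through the JDBC dbtable option, which MySQL executes as a derived table. The sketch below illustrates the mechanism only; the connection URL, credentials, and the WHERE clause are placeholders, and, as noted above, this accepts a single SELECT rather than multi-statement input such as a leading SET.

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("jdbc-subquery-example").getOrCreate()

// The parenthesized query is sent to MySQL and executed there as a derived table.
val pushdownQuery = "(SELECT d.* FROM destination d WHERE d.id > 100) AS src"  // illustrative filter

val mysqlDf = spark.read
  .format("jdbc")
  .option("url", "jdbc:mysql://localhost:3306/mydb")  // placeholder URL
  .option("dbtable", pushdownQuery)
  .option("user", "user")          // placeholder credentials
  .option("password", "password")
  .load()

mysqlDf.show()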