Spark SQL - Escape Query String

Time: 2015-08-12 15:07:41

Tags: sql scala apache-spark apache-spark-sql

I can't believe I'm asking this, but...

How do you escape a SQL query string in Spark SQL using Scala?

I have tried everything and searched everywhere. I thought the Apache Commons library would do it, but no luck:

import org.apache.commons.lang.StringEscapeUtils

var sql = StringEscapeUtils.escapeSql("'Ulmus_minor_'Toledo'");

df.filter("topic = '" + sql + "'").map(_.getValuesMap[Any](List("hits","date"))).collect().foreach(println);

This returns the following:

  

topic = '''Ulmus_minor_''Toledo'''
          ^
    at scala.sys.package$.error(package.scala:27)
    at org.apache.spark.sql.catalyst.SqlParser.parseExpression(SqlParser.scala:45)
    at org.apache.spark.sql.DataFrame.filter(DataFrame.scala:651)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:29)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:34)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:36)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:38)
    at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:40)
    at $iwC$$iwC$$iwC$$iwC.<init>(<console>:42)
    at $iwC$$iwC$$iwC.<init>(<console>:44)
    at $iwC$$iwC.<init>(<console>:46)
    at $iwC.<init>(<console>:48)
    at <init>(<console>:50)
    at .<init>(<console>:54)
    at .<clinit>(<console>)
    at .<init>(<console>:7)
    at .<clinit>(<console>)
    at $print(<console>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
    at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1338)
    at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
    at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
    at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
    at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
    at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
    at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
    at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
    at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
    at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
    at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
    at org.apache.spark.repl.Main$.main(Main.scala:31)
    at org.apache.spark.repl.Main.main(Main.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:665)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:170)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:193)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Any help would be great.

J

2 Answers:

Answer 0: (score: 5)

It may be surprising, but:

var sql = "'Ulmus_minor_'Toledo'"
df.filter(s"""topic = "$sql"""")

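works fine. That said, it is much cleaner to skip the SQL string entirely and compare the column to the value directly: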

df.filter($"topic" <=> sql)
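To see why the first form parses while the version in the question does not, it can help to print the exact strings handed to `filter`. The sketch below is plain Scala with no Spark dependency. `doubleQuotes` is a hypothetical stand-in for `StringEscapeUtils.escapeSql` from commons-lang 2.x, which only turns each single quote into two; the object and value names are illustrative as well.

```scala
object FilterStrings {
  // Hypothetical stand-in for commons-lang 2.x StringEscapeUtils.escapeSql,
  // which only doubles single quotes.
  def doubleQuotes(s: String): String = s.replace("'", "''")

  val topic = "'Ulmus_minor_'Toledo'"

  // The expression string built in the question: single-quoted literal
  // whose inner single quotes have been doubled.
  val fromQuestion: String = "topic = '" + doubleQuotes(topic) + "'"

  // The expression string built in this answer: double-quoted literal,
  // inner single quotes left untouched.
  val fromAnswer: String = s"""topic = "$topic""""

  def main(args: Array[String]): Unit = {
    println(fromQuestion) // topic = '''Ulmus_minor_''Toledo'''
    println(fromAnswer)   // topic = "'Ulmus_minor_'Toledo'"
  }
}
```

The first string is the one echoed in the question's error message, which suggests the parser in that Spark version does not read a doubled single quote as an escape. The second string only works because the value contains no double quote, so it is presumably just as fragile for other inputs. The `$"topic" <=> sql` form never builds an expression string at all, so no quoting is involved.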

Answer 1: (score: 0)

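The title of the question is about escaping strings in Spark SQL in general, so it may be useful to have an answer that works for any string, regardless of how it is used in an expression.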

def sqlEscape(s: String) = 
  org.apache.spark.sql.catalyst.expressions.Literal(s).sql

sqlEscape("'Ulmus_minor_'Toledo' and \"om\"")
res0: String = '\'Ulmus_minor_\'Toledo\' and "om"'
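For readers who want to see what such escaping amounts to without a Spark dependency, here is a minimal plain-Scala sketch that reproduces the output shown above. It is not Spark's implementation: `escapeLiteralSketch` is a hypothetical name, and escaping backslashes before quotes is my assumption about what a backslash-based scheme needs in order to stay unambiguous.

```scala
object SqlEscapeSketch {
  // Hypothetical helper, not part of Spark: wraps the value in single quotes
  // and backslash-escapes backslashes first, then single quotes.
  def escapeLiteralSketch(s: String): String =
    "'" + s.replace("\\", "\\\\").replace("'", "\\'") + "'"

  def main(args: Array[String]): Unit = {
    // Prints: '\'Ulmus_minor_\'Toledo\' and "om"'
    println(escapeLiteralSketch("'Ulmus_minor_'Toledo' and \"om\""))
  }
}
```

Either way, the escaped value already carries its own surrounding quotes, so it would be concatenated as is, for example `"topic = " + sqlEscape(value)`. Whether the parser honors backslash escapes depends on the Spark version and configuration, so this is worth checking against your own setup.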