How do I pass a parameter to selectExpr? SparkSQL - Scala

Asked: 2017-10-31 16:59:12

Tags: apache-spark-sql spark-dataframe

:)

If you have a DataFrame, you can use the selectExpr method to add a column and fill its rows, something like this:

scala> table.show
+------+--------+---------+--------+--------+
|idempr|tipperrd| codperrd|tipperrt|codperrt|
+------+--------+---------+--------+--------+
|  OlcM|       h|999999999|       J|       0|
|  zOcQ|       r|777777777|       J|       1|
|  kyGp|       t|333333333|       J|       2|
|  BEuX|       A|999999999|       F|       3|

scala> var table2 = table.selectExpr("idempr", "tipperrd", "codperrd", "tipperrt", "codperrt", "'hola' as Saludo")
table2: org.apache.spark.sql.DataFrame = [idempr: string, tipperrd: string, codperrd: decimal(9,0), tipperrt: string, codperrt: decimal(9,0), Saludo: string]

scala> table2.show
+------+--------+---------+--------+--------+------+
|idempr|tipperrd| codperrd|tipperrt|codperrt|Saludo|
+------+--------+---------+--------+--------+------+
|  OlcM|       h|999999999|       J|       0|  hola|
|  zOcQ|       r|777777777|       J|       1|  hola|
|  kyGp|       t|333333333|       J|       2|  hola|
|  BEuX|       A|999999999|       F|       3|  hola|

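My question is:

I define a string and call a method that should use this String parameter to fill a column in the DataFrame. But I cannot get the select expression to pick up the value of the string (I tried $, + and so on). I want to achieve something like this: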

scala> var english = "hello"

scala> def generar_informe(df: DataFrame, tabla: String) {
    var selectExpr_df = df.selectExpr(
      "TIPPERSCON_BAS as TIP.PERSONA CONTACTABILIDAD",
      "CODPERSCON_BAS as COD.PERSONA CONTACTABILIDAD",
      "'tabla' as PUNTO DEL FLUJO" )
}

scala> generar_informe(df,english)

.....

scala> table2.show
+------+--------+---------+--------+--------+------+
|idempr|tipperrd| codperrd|tipperrt|codperrt|Saludo|
+------+--------+---------+--------+--------+------+
|  OlcM|       h|999999999|       J|       0| hello|
|  zOcQ|       r|777777777|       J|       1| hello|
|  kyGp|       t|333333333|       J|       2| hello|
|  BEuX|       A|999999999|       F|       3| hello|

I have tried:

scala> var result = tabl.selectExpr("A", "B", "$tabla as C")

scala> var abc = tabl.selectExpr("A", "B", ${tabla} as C)
    <console>:31: error: not found: value $
             var abc = tabl.selectExpr("A", "B", ${tabla} as C)

scala> var abc = tabl.selectExpr("A", "B", "${tabla} as C")

scala> sqlContext.sql("set tabla='hello'")
scala> var abc = tabl.selectExpr("A", "B", "${tabla} as C")

Always the same error:

java.lang.RuntimeException: [1.1] failure: identifier expected
${tabla} as C
^
    at scala.sys.package$.error(package.scala:27)

Thanks in advance!

1 Answer:

Answer 0 (score: 1)

Can you try this?

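import org.apache.spark.sql.DataFrame

// The s prefix makes Scala substitute the value of english into the string.
// The single quotes around it make Spark SQL read the value as a string literal.
def generar_informe(df: DataFrame, english: String): DataFrame =
  df.selectExpr(
    "transactionId", "customerId", "itemId", "amountPaid",
    s"'${english}' as saludo")

val english = "hello"
generar_informe(data, english).show()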
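The difference between the attempts in the question and the call above can be checked without Spark, since it comes down to how Scala builds the string that selectExpr receives. The sketch below is only an illustration: the object name SelectExprStrings and the helper literalColumn are made up for this example and are not part of any Spark API.

```scala
object SelectExprStrings {
  val tabla = "hello"

  // Plain string literal: no substitution happens, so Spark would
  // receive the text "$tabla as C" and fail to parse it.
  val plain: String = "$tabla as C"

  // s-interpolated but unquoted: Spark would receive "hello as C"
  // and look for a column named hello instead of a constant.
  val unquoted: String = s"$tabla as C"

  // Hypothetical helper: wraps the value in single quotes so that the
  // SQL expression treats it as a string literal.
  def literalColumn(value: String, alias: String): String =
    s"'$value' as $alias"

  def main(args: Array[String]): Unit = {
    println(plain)                     // $tabla as C
    println(unquoted)                  // hello as C
    println(literalColumn(tabla, "C")) // 'hello' as C
  }
}
```

Only the last form yields a string that is a valid SQL expression for a constant column. A value that itself contains a single quote would still need escaping before being embedded this way.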

This is the output I get:

17/11/02 23:56:44 INFO CodeGenerator: Code generated in 13.857987 ms
+-------------+----------+------+----------+------+
|transactionId|customerId|itemId|amountPaid|saludo|
+-------------+----------+------+----------+------+
|          111|         1|     1|     100.0| hello|
|          112|         2|     2|     505.0| hello|
|          113|         3|     3|     510.0| hello|
|          114|         4|     4|     600.0| hello|
|          115|         1|     2|     500.0| hello|
|          116|         1|     2|     500.0| hello|
|          117|         1|     2|     500.0| hello|
|          118|         1|     2|     500.0| hello|
|          119|         2|     3|     500.0| hello|
|          120|         1|     2|     500.0| hello|
|          121|         1|     4|     500.0| hello|
|          122|         1|     2|     500.0| hello|
|          123|         1|     4|     500.0| hello|
|          124|         1|     2|     500.0| hello|
+-------------+----------+------+----------+------+

17/11/02 23:56:44 INFO SparkContext: Invoking stop() from shutdown hook
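As a possible alternative, not taken from the answer above, the constant can be attached without building SQL text at all by using lit from org.apache.spark.sql.functions together with withColumn. This is a minimal sketch that assumes Spark is on the classpath. The object name LiteralColumn and the helper withGreeting are hypothetical, and the column name saludo is reused from the answer.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.lit

object LiteralColumn {
  // Hypothetical helper: appends a constant string column named saludo.
  // The value never passes through the SQL expression parser, so quotes
  // inside it need no escaping.
  def withGreeting(df: DataFrame, value: String): DataFrame =
    df.withColumn("saludo", lit(value))
}
```

Calling LiteralColumn.withGreeting(data, "hello").show() should then print the original columns followed by saludo, with the same value in every row.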