SPARK SQL - case when then

Time: 2014-08-06 10:01:31

Tags: sql apache-spark

I'm new to Spark SQL. Is there an equivalent of "CASE WHEN 'condition' THEN 0 END" in Spark SQL?

select case when 1=1 then 1 else 0 end from table

Thanks, Sridhar

4 answers:

Answer 0: (score: 42)

Before Spark 1.2.0

The supported syntax (which I just tried on Spark 1.0.2) appears to be

SELECT IF(1=1, 1, 0) FROM table

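This recent thread, http://apache-spark-user-list.1001560.n3.nabble.com/Supported-SQL-syntax-in-Spark-SQL-td9538.html , links to the SQL parser source, which may or may not help depending on how comfortable you are with Scala. At the very least, the list of keywords starting around line 70 (at the time of writing) should help.

For convenience, here is a direct link to the source: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala

Update for Spark 1.2.0 and later

As of Spark 1.2.0, the more traditional syntax is supported, in response to SPARK-3813: search for "CASE WHEN" in the test source. For example: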

SELECT CASE WHEN key = 1 THEN 1 ELSE 2 END FROM testData

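Update: a more recent place to find the SQL parser's syntax

The parser source can now be found here

Update: more complex examples

In response to the question below, the modern syntax supports complex Boolean conditions.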

SELECT
    CASE WHEN id = 1 OR id = 2 THEN "OneOrTwo" ELSE "NotOneOrTwo" END AS IdRedux
FROM customer

You can involve multiple columns in the condition.

SELECT
    CASE WHEN id = 1 OR state = 'MA' 
         THEN "OneOrMA" 
         ELSE "NotOneOrMA" END AS IdRedux
FROM customer

You can also nest CASE WHEN THEN expressions.

SELECT
    CASE WHEN id = 1 
         THEN "OneOrMA"
         ELSE
             CASE WHEN state = 'MA' THEN "OneOrMA" ELSE "NotOneOrMA" END
    END AS IdRedux
FROM customer
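
The nested form above can also be written as a single CASE with several WHEN branches, since the branches are checked in order and the first match wins. The following is a minimal sketch, assuming Spark 2.x or later and a local SparkSession; the object name FlattenedCase and the three sample customer rows are made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object FlattenedCase {
  // Registers a tiny, hypothetical customer table and classifies each row.
  def classify(spark: SparkSession): DataFrame = {
    import spark.implicits._

    // Sample rows are an assumption, not data from the original question.
    Seq((1, "NY"), (2, "MA"), (3, "CA"))
      .toDF("id", "state")
      .createOrReplaceTempView("customer")

    // One CASE with two WHEN branches replaces the nested CASE.
    spark.sql(
      """SELECT id,
        |       CASE WHEN id = 1 THEN 'OneOrMA'
        |            WHEN state = 'MA' THEN 'OneOrMA'
        |            ELSE 'NotOneOrMA'
        |       END AS IdRedux
        |FROM customer
        |ORDER BY id""".stripMargin)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("flattened-case")
      .getOrCreate()
    classify(spark).show()
    spark.stop()
  }
}
```

With this sample data, ids 1 and 2 are labelled OneOrMA and id 3 is labelled NotOneOrMA, the same result the nested version would give.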

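Answer 1: (score: 19)

For Spark 2.+, use the Spark when function.

From the documentation:

Evaluates a list of conditions and returns one of multiple possible result expressions. If otherwise is not defined at the end, null is returned for unmatched conditions.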

 // Example: encoding gender string column into integer.

   // Scala:
   people.select(when(people("gender") === "male", 0)
     .when(people("gender") === "female", 1)
     .otherwise(2))

   // Java:
   people.select(when(col("gender").equalTo("male"), 0)
     .when(col("gender").equalTo("female"), 1)
     .otherwise(2))
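
The null behaviour mentioned in the documentation is easy to miss, so here is a small sketch of it, assuming Spark 2.x or later and a local SparkSession. The object name WhenWithoutOtherwise and the three sample gender values are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, when}

object WhenWithoutOtherwise {
  // Encodes gender as an integer but deliberately omits otherwise(...).
  def encode(spark: SparkSession): DataFrame = {
    import spark.implicits._

    // Sample values are an assumption made for this illustration.
    val people = Seq("male", "female", "unknown").toDF("gender")

    people.select(
      col("gender"),
      when(col("gender") === "male", 0)
        .when(col("gender") === "female", 1)
        .alias("code") // no otherwise: rows matching no branch get null
    )
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("when-without-otherwise")
      .getOrCreate()
    encode(spark).show()
    spark.stop()
  }
}
```

The "unknown" row ends up with a null code. Adding .otherwise(2) before the alias would give it the value 2 instead.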

Answer 2: (score: 1)

This syntax worked for me in Databricks:

  select 
    org, 
    patient_id,
    case 
      when (age is null) then 'Not Available'
      when (age < 15) then 'Less than 15'
      when (age >= 15 and age < 25) then '15 to 25'
      when (age >= 25 and age < 35) then '25 to 35'
      when (age >= 35 and age < 45) then '35 to 45'
      when (age >= 45) then '45 and Older'
    end as age_range
  from demo
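
The same CASE expression can also be used from the DataFrame API by passing it to the expr function, which parses a SQL expression string into a column. This is a rough sketch, assuming Spark 2.x or later; the object name AgeRange, the helper withAgeRange, and the sample rows are hypothetical, while the column names follow the query above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, expr}

object AgeRange {
  // Adds an age_range column using a SQL CASE expression parsed by expr().
  def withAgeRange(demo: DataFrame): DataFrame =
    demo.select(
      col("org"),
      col("patient_id"),
      expr(
        """CASE
          |  WHEN age IS NULL THEN 'Not Available'
          |  WHEN age < 15 THEN 'Less than 15'
          |  WHEN age >= 15 AND age < 25 THEN '15 to 25'
          |  WHEN age >= 25 AND age < 35 THEN '25 to 35'
          |  WHEN age >= 35 AND age < 45 THEN '35 to 45'
          |  WHEN age >= 45 THEN '45 and Older'
          |END""".stripMargin).alias("age_range")
    )

  // Builds a small, made-up demo table (an assumption for this sketch).
  def sample(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(
      ("A", 1, Option(10)),
      ("A", 2, Option(30)),
      ("B", 3, Option.empty[Int])
    ).toDF("org", "patient_id", "age")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("age-range")
      .getOrCreate()
    withAgeRange(sample(spark)).show()
    spark.stop()
  }
}
```

Keeping the CASE text in one place this way lets the same bucketing logic be shared between SQL queries and DataFrame code.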

Answer 3: (score: 0)

Based on my current production code, this works:

   import org.apache.spark.sql.functions.when

   // Score 100 if the description contains any of the identifiers, otherwise 0.
   val identifierDF = tempIdentifierDF.select(
     tempIdentifierDF("t_item_account_id"),
     when(tempIdentifierDF("h_description").contains(tempIdentifierDF("t_cusip")), 100)
       .when(tempIdentifierDF("h_description").contains(tempIdentifierDF("t_ticker")), 100)
       .when(tempIdentifierDF("h_description").contains(tempIdentifierDF("t_isin")), 100)
       .when(tempIdentifierDF("h_description").contains(tempIdentifierDF("t_sedol")), 100)
       .when(tempIdentifierDF("h_description").contains(tempIdentifierDF("t_valoren")), 100)
       .otherwise(0)
       .alias("identifier_in_description_score")
   )