The Spark SQL documentation describes a when function that returns a Column. The example given is:
people.select(when(people("gender") === "male", 0)
.when(people("gender") === "female", 1)
.otherwise(2))
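In this example, the when expression evaluates to the literal 0, 1, or 2. But what if I want the result to come from a column of the people DataFrame instead? For example, given the following data: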
id | name | gender | testosterone | estrogen
-----------------------------------------------
1 | Joe | male | 10 | 2
2 | Sue | female | 3 | 12
3 | John | male | 9 | 3
4 | Kim | female | 1 | 10
I would like something like this:
SELECT
name,
CASE WHEN gender = "male" THEN testosterone
WHEN gender = "female" THEN estrogen
END AS hormone_level
FROM
people
The result would be:
name | hormone_level
-----------------------
Joe | 10
Sue | 12
John | 9
Kim | 10
Answer 0 (score: 3)
Simply pass a Column as the value instead of a literal:
when(people("gender") === "female", people("estrogen"))
.when(people("gender") === "male", people("testosterone"))
// .otherwise(???) Add base-case if required
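For completeness, here is a minimal sketch of how that expression might be used end to end to reproduce the table from the question. It assumes Spark 2.x or later (SparkSession), a local master, and an in-memory copy of the sample data; the object name HormoneLevel and the app name are made up for illustration. Without an otherwise clause, rows matching neither branch get null, which mirrors a SQL CASE with no ELSE.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.when

// Hypothetical driver object, for illustration only.
object HormoneLevel {
  def main(args: Array[String]): Unit = {
    // Assumption: Spark 2.x+ running in local mode.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("hormone-level")
      .getOrCreate()
    import spark.implicits._

    // The sample data from the question, built in memory.
    val people = Seq(
      (1, "Joe", "male", 10, 2),
      (2, "Sue", "female", 3, 12),
      (3, "John", "male", 9, 3),
      (4, "Kim", "female", 1, 10)
    ).toDF("id", "name", "gender", "testosterone", "estrogen")

    // Each when branch returns a Column rather than a literal;
    // as(...) names the resulting column like the SQL alias.
    val result = people.select(
      people("name"),
      when(people("gender") === "male", people("testosterone"))
        .when(people("gender") === "female", people("estrogen"))
        .as("hormone_level")
    )

    result.show()
    spark.stop()
  }
}
```

If a default is needed for other gender values, appending .otherwise(...) before .as("hormone_level") supplies it.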