I have a table like the one shown below.
I am trying the approach below, but it is certainly neither elegant nor efficient. Is this the only way?
I am looking for an approach that works in BigQuery and PostgreSQL.
SELECT
  CASE
    WHEN date_1 >= date_2 AND date_1 >= date_3 AND date_1 >= date_4 AND date_1 >= date_5 AND date_1 >= date_6 THEN date_1
    WHEN date_2 >= date_1 AND date_2 >= date_3 AND date_2 >= date_4 AND date_2 >= date_5 AND date_2 >= date_6 THEN date_2
  END AS max_date
FROM table_1
I would like my output to look like the following.
Answer 0 (score: 2)
In PostgreSQL, you can use the GREATEST expression:
SELECT GREATEST(date_1, date_2, date_3, date_4, date_5, date_6) AS max_date
...
Because GREATEST is not part of standard SQL, it may not work in other databases.
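One point worth checking before relying on GREATEST in more than one engine is NULL handling. As far as the respective documentation describes it, PostgreSQL ignores NULL arguments (the result is NULL only when every argument is NULL), while BigQuery returns NULL as soon as any argument is NULL. A minimal sketch with made-up literal dates (not taken from the asker's table) that should run unchanged on either engine:

```sql
-- Illustrative literals only; these values are assumptions, not the asker's data.
SELECT GREATEST(
         DATE '2020-01-01',
         CAST(NULL AS DATE),   -- explicit cast so the NULL has a DATE type in both engines
         DATE '2020-03-01'
       ) AS max_date;
-- PostgreSQL: 2020-03-01  (NULL arguments are ignored)
-- BigQuery:   NULL        (any NULL argument makes the result NULL)
```

If the date columns can contain NULL, the same GREATEST call can therefore give different answers depending on where it runs.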
In any case, you can reduce the number of comparisons, because the second WHEN branch of a CASE expression is only tested when the first one is not TRUE:
CASE
WHEN date_1 >= date_2 AND date_1 >= date_3 AND date_1 >= date_4 AND date_1 >= date_5 AND date_1 >= date_6
THEN date_1
WHEN date_2 >= date_3 AND date_2 >= date_4 AND date_2 >= date_5 AND date_2 >= date_6
THEN date_2
WHEN date_3 >= date_4 AND date_3 >= date_5 AND date_3 >= date_6
THEN date_3
WHEN date_4 >= date_5 AND date_4 >= date_6
THEN date_4
WHEN date_5 >= date_6
THEN date_5
ELSE date_6
END
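I don't know whether you would consider this more elegant, but instead of chaining AND conditions you can also use ALL together with a VALUES expression: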
WHEN date_1 >= ALL (VALUES (date_2), (date_3), (date_4), (date_5), (date_6))
THEN date_1
...
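To make the ALL / VALUES form concrete, here is a small self-contained sketch for PostgreSQL. The three-column table_1 CTE and its dates are hypothetical stand-ins for the real table, cut down to three columns to keep it short:

```sql
-- Hypothetical sample row; the real table has six date columns.
WITH table_1(date_1, date_2, date_3) AS (
  VALUES (DATE '2020-01-05', DATE '2020-02-01', DATE '2020-01-20')
)
SELECT
  CASE
    -- date_1 is the maximum only if it is >= every other column
    WHEN date_1 >= ALL (VALUES (date_2), (date_3)) THEN date_1
    -- reached only when date_1 lost, so date_1 need not be compared again
    WHEN date_2 >= date_3 THEN date_2
    ELSE date_3
  END AS max_date
FROM table_1;
-- max_date: 2020-02-01
```

Note that a comparison against NULL is not TRUE, so with NULL dates a branch can be skipped even when its column would otherwise be the maximum.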
Answer 1 (score: 1)
You can try the following:
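SELECT subject_id, hadm_id, icustay_id,
  (
    -- MAX skips NULLs, so missing dates are ignored
    SELECT MAX(v)
    FROM (VALUES (date_1), (date_2), (date_3), (date_4), (date_5), (date_6)) AS value(v)
  ) AS max_date
FROM Table_Name
-- No GROUP BY: the subquery reads date_1..date_6 of the current row,
-- which are not grouping columns, so grouping would raise an error.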
Answer 2 (score: 1)
Another option is the following:
-- Unpivot the six date columns into one column, then aggregate per key.
WITH table_2 AS (
  SELECT subject_id, hadm_id, icustay_id, date_1 AS date_x FROM table_1
  UNION ALL SELECT subject_id, hadm_id, icustay_id, date_2 AS date_x FROM table_1
  UNION ALL SELECT subject_id, hadm_id, icustay_id, date_3 AS date_x FROM table_1
  UNION ALL SELECT subject_id, hadm_id, icustay_id, date_4 AS date_x FROM table_1
  UNION ALL SELECT subject_id, hadm_id, icustay_id, date_5 AS date_x FROM table_1
  UNION ALL SELECT subject_id, hadm_id, icustay_id, date_6 AS date_x FROM table_1
)
SELECT subject_id, hadm_id, icustay_id, MAX(date_x) AS max_date FROM table_2
GROUP BY subject_id, hadm_id, icustay_id
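Answer 3 (score: 1)
The following is for BigQuery Standard SQL and makes only a few assumptions about the sample data: your date_N columns are of DATE type, so empty values are actually NULL. In that case, you can use the query below to find the maximum date across the columns of each row:
#standardSQL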
SELECT * EXCEPT(date_1, date_2, date_3, date_4, date_5, date_6),
(SELECT MAX(val) FROM UNNEST(
SPLIT(REGEXP_REPLACE(FORMAT('%t', [date_1, date_2, date_3, date_4, date_5, date_6]), r'[\[\] ]', ''))
) val
WHERE val != 'NULL'
) max_date
FROM `project.dataset.table`
If the date_N columns are of STRING type, you can use the following instead:
#standardSQL
SELECT * EXCEPT(date_1, date_2, date_3, date_4, date_5, date_6),
(SELECT MAX(PARSE_DATE('%m/%d/%Y', val)) FROM UNNEST(
SPLIT(REGEXP_REPLACE(FORMAT('%t', [date_1, date_2, date_3, date_4, date_5, date_6]), r'[\[\] ]', ''))) val
WHERE val != ''
) max_date
FROM `project.dataset.table`
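For DATE columns, a different and somewhat shorter technique than the FORMAT / SPLIT approach is to unnest an array of the columns directly and take MAX, which skips NULLs and keeps the result typed as DATE. This is a sketch, not the answerer's method; the sample CTE, its id column, and its values are assumptions made up for illustration:

```sql
#standardSQL
-- Hypothetical sample rows with three date columns instead of six.
WITH sample AS (
  SELECT 1 AS id, DATE '2020-01-05' AS date_1, CAST(NULL AS DATE) AS date_2, DATE '2020-03-01' AS date_3
  UNION ALL
  SELECT 2, CAST(NULL AS DATE), CAST(NULL AS DATE), CAST(NULL AS DATE)
)
SELECT
  id,
  -- Build an array from the row's columns, unnest it, and aggregate;
  -- MAX ignores NULL elements and yields NULL when every element is NULL.
  (SELECT MAX(d) FROM UNNEST([date_1, date_2, date_3]) AS d) AS max_date
FROM sample;
-- id 1 -> 2020-03-01, id 2 -> NULL
```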
Answer 4 (score: 1)
This should work well and can be used in both BigQuery and PostgreSQL environments:
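select
  subject_id,
  hadm_id,
  icustay_id,
  -- replace NULLs with an early sentinel date so they never win the comparison
  greatest(
    coalesce(date_1, DATE '1900-01-01'),
    coalesce(date_2, DATE '1900-01-01'),
    coalesce(date_3, DATE '1900-01-01'),
    coalesce(date_4, DATE '1900-01-01'),
    coalesce(date_5, DATE '1900-01-01'),
    coalesce(date_6, DATE '1900-01-01')
  ) as max_date
-- backticks are BigQuery quoting; in PostgreSQL write the table name without them
from `dataset.table`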
Hope this helps.
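One side effect of the COALESCE sentinel is that a row whose date columns are all NULL comes back as 1900-01-01 rather than NULL. If that matters, one possible refinement is to wrap the result in NULLIF. The sketch below uses a hypothetical two-column sample CTE and assumes 1900-01-01 never occurs as a real date in the data:

```sql
-- Hypothetical sample rows; should run on both BigQuery and PostgreSQL.
WITH sample AS (
  SELECT 1 AS id, DATE '2020-01-05' AS date_1, CAST(NULL AS DATE) AS date_2
  UNION ALL
  SELECT 2, CAST(NULL AS DATE), CAST(NULL AS DATE)
)
SELECT
  id,
  -- NULLIF turns the sentinel back into NULL when no real date was present
  NULLIF(
    GREATEST(
      COALESCE(date_1, DATE '1900-01-01'),
      COALESCE(date_2, DATE '1900-01-01')
    ),
    DATE '1900-01-01'
  ) AS max_date
FROM sample;
-- id 1 -> 2020-01-05, id 2 -> NULL
```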