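I have a string column whose values are dates. Some are valid (yyyy-MM-dd) and some are not. How can I separate the valid values from the invalid ones using only Hive? I can't use a custom UDF or Spark, so it has to be done with built-in Hive functions.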
select * from date_test;
+-------------------+--+
| date_test.mydate |
+-------------------+--+
| 2018-12-13 | => valid
| 2018-13-12 | => invalid
| 2018-04-31 | => invalid
+-------------------+--+
select mydate,to_date(mydate) from date_test;
+-------------+-------------+--+
| mydate | _c1 |
+-------------+-------------+--+
| 2018-12-13 | 2018-12-13 |
| 2018-13-12 | 2019-01-12 | => to_date() rolls it over to a valid date
| 2018-04-31 | 2018-05-01 | => to_date() rolls it over to a valid date
+-------------+-------------+--+
Answer 0 (score: 1)
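I managed to get this working, but I'm open to other, better approaches. The idea is that to_date() returns a real date unchanged and rolls an impossible one over to a different date, so comparing the column with its own to_date() result separates the two cases.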
-- valid date values
select
mydate,
to_date(mydate)
from
date_test
where
mydate = to_date(mydate);
+-------------+-------------+--+
| mydate | _c1 |
+-------------+-------------+--+
| 2018-12-13 | 2018-12-13 |
+-------------+-------------+--+
-- invalid date values
select
mydate,
to_date(mydate)
from
date_test
where
mydate <> to_date(mydate);
+-------------+-------------+--+
| mydate | _c1 |
+-------------+-------------+--+
| 2018-13-12 | 2019-01-12 |
| 2018-04-31 | 2018-05-01 |
+-------------+-------------+--+
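One gap in the two queries above: a row where to_date() returns NULL (for example a non-date string such as 'abc', or, on Hive versions where to_date() is strict, an out-of-range date) satisfies neither `=` nor `<>`, so it shows up in neither result. A minimal sketch of a single-pass variant that treats those rows as invalid is below. The `status` alias and the extra `rlike` format check are my own additions, not part of the original answer, and the explicit cast to string is only there on the assumption that to_date() may return a DATE rather than a string depending on the Hive version.

```sql
-- Sketch only: label every row in one pass instead of running two queries.
-- Assumes the date_test table and mydate column from the question.
select
  mydate,
  case
    when mydate is not null
     -- enforce the yyyy-MM-dd shape before comparing (added assumption)
     and mydate rlike '^[0-9]{4}-[0-9]{2}-[0-9]{2}$'
     -- a real date survives the round trip through to_date() unchanged
     and mydate = cast(to_date(mydate) as string)
    then 'valid'
    -- reached when any check is false or evaluates to NULL
    else 'invalid'
  end as status
from
  date_test;
```

Because a NULL condition in `case when` falls through to the `else` branch, rows that to_date() cannot parse at all end up labelled 'invalid' rather than being dropped.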