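I have a table in which every column is a string, but I know that some of the columns actually hold numbers (or dates). Does BigQuery have a built-in function to infer the data type of each column? Something like SELECT is_string(column_name) FROM table_name?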
Answer 0: (score: 1)
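One idea that comes to mind is to combine SAFE_CAST with LOGICAL_AND: SAFE_CAST returns NULL instead of raising an error when a value cannot be converted, and LOGICAL_AND checks that the conversion succeeded for every row. For example: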
#standardSQL
WITH T AS (
SELECT '2017-05-01' AS x, '3.14' AS y, '5' AS z UNION ALL
SELECT '2017-03-02' AS x, '1.59' AS y, '-1' AS z UNION ALL
SELECT NULL AS x, NULL AS y, NULL AS z
)
SELECT
LOGICAL_AND(x IS NULL OR SAFE_CAST(x AS DATE) IS NOT NULL) AS x_is_date,
LOGICAL_AND(y IS NULL OR SAFE_CAST(y AS FLOAT64) IS NOT NULL) AS y_is_float64,
LOGICAL_AND(z IS NULL OR SAFE_CAST(z AS TIMESTAMP) IS NOT NULL) AS z_is_timestamp
FROM T;
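This returns true, true, and false (the z values are not timestamps). If you want to reuse the same expression several times, you can make it more concise with a SQL UDF: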
#standardSQL
CREATE TEMP FUNCTION IsDate(x STRING) AS (
x IS NULL OR SAFE_CAST(x AS DATE) IS NOT NULL
);
WITH T AS (
SELECT '2017-05-01' AS x, '3.14' AS y, '5' AS z UNION ALL
SELECT '2017-03-02' AS x, '1.59' AS y, '-1' AS z UNION ALL
SELECT NULL AS x, NULL AS y, NULL AS z
)
SELECT
LOGICAL_AND(IsDate(x)) AS x_is_date,
LOGICAL_AND(IsDate(y)) AS y_is_date,
LOGICAL_AND(IsDate(z)) AS z_is_date
FROM T;
This returns true, false, and false, because only x has values in date format.
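The same checks can be chained to report one candidate type per column. The following is only a sketch under a few assumptions that are not part of the answer above: the Unpivoted CTE, the column_name/value/inferred_type names, the 'UNKNOWN' label, and the order in which types are tried (narrowest first) are all illustrative choices. It also treats a column with no non-NULL values as undetermined, since otherwise such a column would pass every check.

```sql
#standardSQL
WITH T AS (
  SELECT '2017-05-01' AS x, '3.14' AS y, '5' AS z UNION ALL
  SELECT '2017-03-02' AS x, '1.59' AS y, '-1' AS z UNION ALL
  SELECT NULL AS x, NULL AS y, NULL AS z
),
-- Illustrative helper CTE: turn each column into (column_name, value) rows
-- so that a single CASE expression can be applied to every column.
Unpivoted AS (
  SELECT col.name AS column_name, col.value AS value
  FROM T, UNNEST([
    STRUCT('x' AS name, x AS value),
    STRUCT('y' AS name, y AS value),
    STRUCT('z' AS name, z AS value)
  ]) AS col
)
SELECT
  column_name,
  CASE
    -- COUNT(value) ignores NULLs, so 0 means there is nothing to infer from.
    WHEN COUNT(value) = 0 THEN 'UNKNOWN'
    -- Try the narrowest type first; the first check that every
    -- non-NULL value passes wins.
    WHEN LOGICAL_AND(value IS NULL OR SAFE_CAST(value AS INT64) IS NOT NULL) THEN 'INT64'
    WHEN LOGICAL_AND(value IS NULL OR SAFE_CAST(value AS FLOAT64) IS NOT NULL) THEN 'FLOAT64'
    WHEN LOGICAL_AND(value IS NULL OR SAFE_CAST(value AS DATE) IS NOT NULL) THEN 'DATE'
    WHEN LOGICAL_AND(value IS NULL OR SAFE_CAST(value AS TIMESTAMP) IS NOT NULL) THEN 'TIMESTAMP'
    ELSE 'STRING'
  END AS inferred_type
FROM Unpivoted
GROUP BY column_name
ORDER BY column_name;
```

For the sample rows this should report DATE for x, FLOAT64 for y, and INT64 for z. DATE is tried before TIMESTAMP because a plain date string also casts successfully to TIMESTAMP. Note that SAFE_CAST only recognizes BigQuery's canonical literal formats, so dates written in other layouts would fall through to STRING.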