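I have records with fields such as user_id, date, country, and so on.
Some of the countries are "unknown". When I group by user_id, I would like to return the next value after the unknown one, if it exists. If there is none, return unknown.
So, from input data like this: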
user_id | date       | country | gender
----------------------------------------
A       | 2015-10-01 | unknown | M
A       | 2015-10-02 | US      | M
B       | 2015-10-01 | CA      | M
B       | 2015-10-02 | US      | M
C       | 2015-10-04 | US      | M
C       | 2015-10-06 | US      | M
I would like a query that returns:
date       | country | gender | num_users
-------------------------------------------
2015-10-02 | US      | M      | 2
2015-10-01 | CA      | M      | 1
2015-10-04 | US      | M      | 1
I am currently using a plain GROUP EACH BY, but that does not take the unknowns into account.
SELECT
FIRST(date),
FIRST(country),
COUNT(DISTINCT user_id,50000000) AS num_users
FROM
my_table
WHERE
date BETWEEN '2015-10-01' AND CURRENT_DATE()
GROUP BY
date,
country
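I am using BigQuery, but I can most likely adapt any solution. Any ideas? Thanks.
Answer 0: (score: 2)
Here is one way to solve the problem. The example covers both the case where every country is "unknown" for the same user and the case where only some of them are unknown.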
select
user_id,
first(date),
ifnull(first(if(country = "unknown", null, country)), "unknown")
from
(select "A" user_id, "2015-10-01" date, "unknown" country),
(select "A" user_id, "2015-10-02" date, "unknown" country),
(select "B" user_id, "2015-10-01" date, "CA" country),
(select "B" user_id, "2015-10-02" date, "US" country),
(select "C" user_id, "2015-10-04" date, "unknown" country),
(select "C" user_id, "2015-10-06" date, "US" country)
group by user_id
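
If you want to try the same idea outside BigQuery, here is a minimal sketch that runs it against an in-memory SQLite database from Python. SQLite is only a stand-in here, not BigQuery: the table name records, the function name first_known_country_per_user, and the column alias first_date are made up for illustration. Since SQLite has no FIRST(), the sketch assumes that "first" means "earliest by date" and makes that ordering explicit with a correlated subquery, falling back to "unknown" through COALESCE.

```python
import sqlite3

# Per user: the earliest date, plus the earliest country that is not
# 'unknown'. COALESCE falls back to 'unknown' when the subquery finds no row.
QUERY = """
SELECT
  r.user_id,
  MIN(r.date) AS first_date,
  COALESCE(
    (SELECT k.country
       FROM records AS k
      WHERE k.user_id = r.user_id
        AND k.country <> 'unknown'
      ORDER BY k.date
      LIMIT 1),
    'unknown') AS country
FROM records AS r
GROUP BY r.user_id
ORDER BY r.user_id
"""

# Same sample rows as in the query above (hypothetical table layout).
SAMPLE_ROWS = [
    ("A", "2015-10-01", "unknown"),
    ("A", "2015-10-02", "unknown"),
    ("B", "2015-10-01", "CA"),
    ("B", "2015-10-02", "US"),
    ("C", "2015-10-04", "unknown"),
    ("C", "2015-10-06", "US"),
]


def first_known_country_per_user(rows):
    """Load rows into a throwaway SQLite table and run QUERY against it."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE records (user_id TEXT, date TEXT, country TEXT)"
        )
        conn.executemany("INSERT INTO records VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in first_known_country_per_user(SAMPLE_ROWS):
        print(row)
```

With the sample rows, user A (all unknown) keeps "unknown", B gets CA, and C skips its leading unknown and gets US. The dates are ISO-formatted strings, so sorting them as text gives chronological order.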