BigQuery - How to select conditionally within a group

Time: 2015-07-31 23:39:38

Tags: google-bigquery

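I have records like the ones below, with user_id, date, country, and so on. Some countries are "unknown". When I group by user_id, I want to return the next value after the unknown one, if it exists. If it doesn't, return unknown.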

So, from input data like this:

user_id  |   date     | country | gender
----------------------------------------
   A       2015-10-01   unknown     M
   A       2015-10-02      US       M
   B       2015-10-01      CA       M
   B       2015-10-02      US       M
   C       2015-10-04      US       M
   C       2015-10-06      US       M

I'd like a query that returns:

   date     | country | gender | num_users
-------------------------------------------
 2015-10-02      US       M          2
 2015-10-01      CA       M          1
 2015-10-04      US       M          1

I'm currently using a plain GROUP EACH BY, but that doesn't account for the unknowns.

SELECT
  FIRST(date),
  FIRST(country),
  COUNT(DISTINCT user_id,50000000) AS num_users
FROM
  my_table
WHERE
  date BETWEEN '2015-10-01' AND CURRENT_DATE()
GROUP BY
  date,
  country

I'm using BigQuery, but I can probably adapt any solution. Any ideas? Thanks.

1 Answer:

Answer 0: (score: 2)

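Here is one way to solve it. The example covers both the case where every country for a user is "unknown" and the case where only some of them are.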

-- Legacy SQL: the commas between the subselects in FROM act as UNION ALL.
select
  user_id,
  first(date),
  -- Turn "unknown" into null so first() can pick a known country;
  -- ifnull() restores "unknown" when the user has no known country.
  ifnull(first(if(country = "unknown", null, country)), "unknown")
from
  (select "A" user_id, "2015-10-01" date, "unknown" country),
  (select "A" user_id, "2015-10-02" date, "unknown" country),
  (select "B" user_id, "2015-10-01" date, "CA" country),
  (select "B" user_id, "2015-10-02" date, "US" country),
  (select "C" user_id, "2015-10-04" date, "unknown" country),
  (select "C" user_id, "2015-10-06" date, "US" country)
group by user_id
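
The query above returns one row per user. The question also asks for the date of the chosen row and a count per date, country, and gender. A minimal sketch of one way to get there is to rank each user's rows so that known countries sort ahead of "unknown" (earliest date first), keep the top row, and then aggregate. It is not part of the original answer. It is run here through Python's sqlite3 module purely as a local stand-in for BigQuery, so the table name my_table, the helper name first_known_country_counts, and the use of SQLite (3.25 or newer, for window functions) are all assumptions. The SQL sticks to ROW_NUMBER() OVER and CASE, which should carry over to BigQuery with little change.

```python
import sqlite3

# Sample rows copied from the question: (user_id, date, country, gender).
ROWS = [
    ("A", "2015-10-01", "unknown", "M"),
    ("A", "2015-10-02", "US", "M"),
    ("B", "2015-10-01", "CA", "M"),
    ("B", "2015-10-02", "US", "M"),
    ("C", "2015-10-04", "US", "M"),
    ("C", "2015-10-06", "US", "M"),
]

# Rank each user's rows: known countries first, then earliest date.
# A user whose rows are all "unknown" still gets rn = 1 on an "unknown" row.
QUERY = """
WITH ranked AS (
  SELECT
    user_id, date, country, gender,
    ROW_NUMBER() OVER (
      PARTITION BY user_id
      ORDER BY CASE WHEN country = 'unknown' THEN 1 ELSE 0 END, date
    ) AS rn
  FROM my_table
)
SELECT date, country, gender, COUNT(DISTINCT user_id) AS num_users
FROM ranked
WHERE rn = 1
GROUP BY date, country, gender
ORDER BY date, country
"""


def first_known_country_counts(rows):
    """Hypothetical helper: load rows into an in-memory table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE my_table "
            "(user_id TEXT, date TEXT, country TEXT, gender TEXT)"
        )
        conn.executemany("INSERT INTO my_table VALUES (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in first_known_country_counts(ROWS):
        print(row)
```

With the sample rows, user A resolves to its 2015-10-02 US row instead of the earlier unknown one, and a user with only unknown rows would still be counted under "unknown".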