Is a self-join the way to query two different date ranges?

Date: 2016-11-10 11:01:07

Tags: sql google-bigquery self-join

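I tried searching for an answer and read posts like this one: SQL Self-join with data comparison for different days, but I can't quite understand how it would work in this case.

Any help is appreciated.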

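I have a table with:

  • UserID (number)
  • UserType (string, shows whether they are a member or a guest)
  • sales_date (datestamp field)
  • (plus other columns, such as what they bought and how much it cost, which I'm not interested in right now)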

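I'm trying to write a query that tells me how many people became members and how many became guests each month, so that I can answer questions like:

  • "How many people were here in September and came back in October?"
  • "How many people were members in September but were downgraded to guests in October?"
  • "How many people were guests in September but upgraded to members in October?"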

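1: Is a self-join the way to go when you need two different date ranges from the same table/same query?

2: I think I need to select UserID, then the UserType for Sept and the UserType for Oct. Does that sound right? I don't know how to ask for 2 different dates.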

SELECT
      t1.UserID,
      t1.UserType as UserTypeSept,
      t2.UserType as UserTypeOct
   FROM 
      my_table t1
         join my_table t2
            on t1.UserID = t2.UserID
           AND t2.day > '2015-01-01' AND t2.day < '2015-02-01'
   where
      t1.day  >'2015-02-01' AND t1.day <'2015-03-01'
;

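Am I thinking along the right lines? Even if this works, it won't tell me how many people went from "Member" to "Guest" between September and October, but at least it would show their values in two different columns.

Thanks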

2 Answers:

Answer 0: (score: 0)

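I suggest using analytic (window) functions instead of a more expensive self-join. Your data lends itself to windowing. Run the query below, then adapt it to your table. You will probably want to format the printed period, and use a CASE clause to give the month-to-month transitions, such as "Member - Guest", more meaningful names.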

WITH
  members AS (
  SELECT 1 AS UserID, 'Member' AS UserType,  TIMESTAMP '2015-01-01' AS sales_date
  UNION ALL SELECT 1 AS UserID, 'Guest' AS UserType, TIMESTAMP '2015-02-01' AS sales_date 
  UNION ALL SELECT 2 AS UserID, 'Guest' AS UserType, TIMESTAMP '2015-01-01' AS sales_date
  UNION ALL SELECT 2 AS UserID, 'Member' AS UserType,TIMESTAMP '2015-02-01' AS sales_date
  UNION ALL SELECT 3 AS UserID, 'Guest' AS UserType, TIMESTAMP '2015-01-01' AS sales_date
  UNION ALL SELECT 3 AS UserID, 'Guest' AS UserType, TIMESTAMP '2015-02-01' AS sales_date
  UNION ALL SELECT 4 AS UserID, 'Guest' AS UserType, TIMESTAMP '2015-01-01' AS sales_date
  UNION ALL SELECT 4 AS UserID, 'Member' AS UserType,TIMESTAMP '2015-02-01' AS sales_date
  UNION ALL SELECT 5 AS UserID, 'Guest' AS UserType, TIMESTAMP '2016-07-01' AS sales_date 
  UNION ALL SELECT 5 AS UserID, 'Guest' AS UserType, TIMESTAMP '2016-08-01' AS sales_date
  UNION ALL SELECT 6 AS UserID, 'Member' AS UserType,TIMESTAMP '2016-03-01' AS sales_date
  UNION ALL SELECT 7 AS UserID, 'Guest' AS UserType, TIMESTAMP '2016-04-01' AS sales_date
  UNION ALL SELECT 7 AS UserID, 'Guest' AS UserType, TIMESTAMP '2016-05-01' AS sales_date
  UNION ALL SELECT 8 AS UserID, 'Guest' AS UserType, TIMESTAMP '2016-01-01' AS sales_date
  UNION ALL SELECT 8 AS UserID, 'Member' AS UserType,TIMESTAMP '2016-02-01' AS sales_date
  UNION ALL SELECT 9 AS UserID, 'Guest' AS UserType, TIMESTAMP '2016-01-03' AS sales_date
  UNION ALL SELECT 9 AS UserID, 'Member' AS UserType,TIMESTAMP '2016-02-06' AS sales_date)
SELECT
  COUNT(*) AS users,
  member,
  period,
  year
FROM (
  SELECT
    UserType,
    UserID,
    sales_date,
    FORMAT_DATE("%Y", DATE(sales_date)) AS year,
    -- Month of this row and month of the same user's next row, e.g. "Jan - Feb"
    CONCAT(
      FORMAT_DATE("%b", DATE(sales_date)),
      ' - ',
      FORMAT_DATE("%b", DATE(LEAD(sales_date, 1) OVER (PARTITION BY UserID ORDER BY sales_date ASC)))
    ) AS period,
    -- Type on this row and type on the same user's next row, e.g. "Guest - Member"
    CONCAT(UserType, ' - ', LEAD(UserType, 1) OVER (PARTITION BY UserID ORDER BY sales_date ASC)) AS member
  FROM
    members )
WHERE
  member IS NOT NULL  -- drops each user's last row, which has no next row
  AND year = '2016'
GROUP BY
  year,
  member,
  period
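
As a rough sketch of the CASE idea mentioned in this answer (not part of the original answer): the labels, the aliases from_type, to_type, transition and users, and the three-user sample data are all made up for illustration. Keep in mind that LEAD compares each row with the same user's next row, so this assumes one row per user per month, as in the sample data.

```sql
-- Sketch only: `members` is a tiny stand-in for the real table.
WITH members AS (
  SELECT 1 AS UserID, 'Member' AS UserType, TIMESTAMP '2016-01-01' AS sales_date
  UNION ALL SELECT 1, 'Guest',  TIMESTAMP '2016-02-01'
  UNION ALL SELECT 2, 'Guest',  TIMESTAMP '2016-01-01'
  UNION ALL SELECT 2, 'Member', TIMESTAMP '2016-02-01'
  UNION ALL SELECT 3, 'Guest',  TIMESTAMP '2016-01-01'
  UNION ALL SELECT 3, 'Guest',  TIMESTAMP '2016-02-01'
),
transitions AS (
  SELECT
    UserID,
    UserType AS from_type,
    -- Type on the same user's next row; NULL on the user's last row
    LEAD(UserType) OVER (PARTITION BY UserID ORDER BY sales_date) AS to_type
  FROM members
)
SELECT
  CASE
    WHEN from_type = 'Guest'  AND to_type = 'Member' THEN 'Upgraded to member'
    WHEN from_type = 'Member' AND to_type = 'Guest'  THEN 'Downgraded to guest'
    WHEN from_type = to_type THEN CONCAT('Stayed ', LOWER(from_type))
    ELSE CONCAT(from_type, ' - ', to_type)  -- fallback for any other type values
  END AS transition,
  COUNT(*) AS users
FROM transitions
WHERE to_type IS NOT NULL
GROUP BY transition
ORDER BY transition
```

If a user can have several rows within one month, the rows would need to be reduced to one per user and month before applying LEAD.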

Answer 1: (score: 0)

  

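1: Is a self-join the way to go when you need two different date ranges from the same table/same query?

Not really! It depends! In your case, see #2 below.

2: I think I need to select UserID, then the UserType for Sept vs. the UserType for Oct

I think the query below is what you are after. Please note: it takes the UserType at the end of each month (the type on the user's latest sales_date in that month) and uses it as the user's type for that month.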

/*
WITH my_table AS (
  SELECT 1 AS UserID, 'Member' AS UserType, TIMESTAMP '2015-09-01' AS sales_date UNION ALL
  SELECT 1 AS UserID, 'Member' AS UserType, TIMESTAMP '2015-09-02' AS sales_date UNION ALL
  SELECT 1 AS UserID, 'Member' AS UserType, TIMESTAMP '2015-09-03' AS sales_date UNION ALL
  SELECT 1 AS UserID,  'Guest' AS UserType, TIMESTAMP '2015-09-10' AS sales_date UNION ALL
  SELECT 1 AS UserID,  'Guest' AS UserType, TIMESTAMP '2015-10-01' AS sales_date UNION ALL
  SELECT 1 AS UserID,  'Guest' AS UserType, TIMESTAMP '2015-10-02' AS sales_date UNION ALL
  SELECT 2 AS UserID,  'Guest' AS UserType, TIMESTAMP '2015-09-01' AS sales_date UNION ALL
  SELECT 2 AS UserID, 'Member' AS UserType, TIMESTAMP '2015-10-01' AS sales_date UNION ALL
  SELECT 3 AS UserID,  'Guest' AS UserType, TIMESTAMP '2015-09-01' AS sales_date UNION ALL
  SELECT 3 AS UserID,  'Guest' AS UserType, TIMESTAMP '2015-10-01' AS sales_date UNION ALL
  SELECT 4 AS UserID,  'Guest' AS UserType, TIMESTAMP '2015-09-01' AS sales_date UNION ALL
  SELECT 4 AS UserID, 'Member' AS UserType, TIMESTAMP '2015-10-01' AS sales_date ) 
*/
SELECT 
  UserID,
  MAX(CASE WHEN sales_year_month = '2015-09' THEN UserTypeAtEndOfMonth END) AS UserTypeSept,
  MAX(CASE WHEN sales_year_month = '2015-10' THEN UserTypeAtEndOfMonth END) AS UserTypeOct
FROM (
  SELECT 
    UserID, 
    FORMAT_DATE('%Y-%m', DATE(sales_date)) AS sales_year_month,
    ARRAY_AGG(UserType ORDER BY sales_date DESC LIMIT 1)[OFFSET(0)] AS UserTypeAtEndOfMonth
  FROM my_table 
  GROUP BY 1, 2
)
GROUP BY 1

To test this on the sample data, remove the comment markers (/* and */) around the WITH clause.
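
To get from the per-user columns to the "how many" counts asked about in the question, one option is to group that result again by the two type columns. This is a minimal sketch, assuming the same my_table layout; the CTE names monthly and per_user, the users alias and the trimmed sample rows are illustrative and not from the original answer.

```sql
-- Sketch only: `my_table` is a trimmed stand-in for the real table.
WITH my_table AS (
  SELECT 1 AS UserID, 'Member' AS UserType, TIMESTAMP '2015-09-01' AS sales_date
  UNION ALL SELECT 1, 'Guest',  TIMESTAMP '2015-09-10'
  UNION ALL SELECT 1, 'Guest',  TIMESTAMP '2015-10-01'
  UNION ALL SELECT 2, 'Guest',  TIMESTAMP '2015-09-01'
  UNION ALL SELECT 2, 'Member', TIMESTAMP '2015-10-01'
  UNION ALL SELECT 3, 'Guest',  TIMESTAMP '2015-09-01'
  UNION ALL SELECT 3, 'Guest',  TIMESTAMP '2015-10-01'
),
monthly AS (
  -- One row per user and month, carrying the type of the latest sale in that month
  SELECT
    UserID,
    FORMAT_DATE('%Y-%m', DATE(sales_date)) AS sales_year_month,
    ARRAY_AGG(UserType ORDER BY sales_date DESC LIMIT 1)[OFFSET(0)] AS UserTypeAtEndOfMonth
  FROM my_table
  GROUP BY 1, 2
),
per_user AS (
  -- One row per user, with the two months side by side
  SELECT
    UserID,
    MAX(CASE WHEN sales_year_month = '2015-09' THEN UserTypeAtEndOfMonth END) AS UserTypeSept,
    MAX(CASE WHEN sales_year_month = '2015-10' THEN UserTypeAtEndOfMonth END) AS UserTypeOct
  FROM monthly
  GROUP BY 1
)
SELECT
  UserTypeSept,
  UserTypeOct,
  COUNT(*) AS users
FROM per_user
WHERE UserTypeSept IS NOT NULL  -- keep only users seen in both months
  AND UserTypeOct IS NOT NULL
GROUP BY UserTypeSept, UserTypeOct
ORDER BY UserTypeSept, UserTypeOct
```

Users who bought in only one of the two months are filtered out by the WHERE clause; dropping it would list them too, with NULL for the missing month.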