SQL retention query

Time: 2014-09-12 08:47:12

Tags: sql hiveql retention

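I'm a complete beginner at SQL (and at coding of any kind), but I'm trying to write a basic query that returns, by country, the number of users who logged in for the first time after August 15, along with the number of those users who came back the day after their first session.

I'm working with a table called events, which has the following columns: utc_timestamp, name, id, and a JSON string holding several parameters of the event (as you can see, I use it to retrieve the session number and the country).

When I run this query, it says "line 5: cannot recognize input near 'SELECT' 'DISTINCT' 'id' in function specification". I tried putting parentheses between SELECT and DISTINCT, and I still get the same error message. Any idea what is causing it?

Thanks for your help.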

SELECT 
get_json_object(json, '$.User_Country') AS country
, COUNT(DISTINCT id) AS Users
, COUNT(
    SELECT DISTINCT id 
    FROM events 
    WHERE EXISTS(
        SELECT * 
        FROM events
        WHERE name = "Logged_in" 
        AND utc_timestamp>(
            (
            SELECT utc_timestamp
            FROM events
            WHERE month = 201408
            AND name = "Logged_in" 
            AND get_json_object(json, '$.Session_nb') = 0
            AND utc_timestamp > UNIXTIMESTAMP('2014-08-15 12:00:00')
            ) + INTERVAL '1 day'
        )
    )
) AS Retained1
FROM events 
WHERE month = 201408 
AND name = "Logged_in" 
AND get_json_object(json, '$.Session_nb') = 0
AND utc_timestamp > UNIXTIMESTAMP('2014-08-15 12:00:00')
GROUP BY (get_json_object(json, '$.User_Country')) 
ORDER BY (get_json_object(json, '$.User_Country'))

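1 answer:

Answer 0 (score: 1)

Don't count the result of the subquery. COUNT takes an expression, not a SELECT statement, which is why the parser stops at line 5. Instead, use COUNT inside the subquery itself: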

SELECT 
get_json_object(json, '$.User_Country') AS country
, COUNT(DISTINCT id) AS Users
-- COUNT now sits inside the subquery, so the parenthesized part yields a single value
, (
    SELECT COUNT(DISTINCT id)
    FROM events 
    WHERE EXISTS(
        SELECT * 
        FROM events
        WHERE name = "Logged_in" 
        AND utc_timestamp>(
            (
            SELECT utc_timestamp
            FROM events
            WHERE month = 201408
            AND name = "Logged_in" 
            AND get_json_object(json, '$.Session_nb') = 0
            -- Hive's built-in function is spelled UNIX_TIMESTAMP
            AND utc_timestamp > UNIX_TIMESTAMP('2014-08-15 12:00:00')
            ) + INTERVAL '1 day'
        )
    )
) AS Retained1
FROM events 
WHERE month = 201408 
AND name = "Logged_in" 
AND get_json_object(json, '$.Session_nb') = 0
AND utc_timestamp > UNIX_TIMESTAMP('2014-08-15 12:00:00')
GROUP BY (get_json_object(json, '$.User_Country')) 
ORDER BY (get_json_object(json, '$.User_Country'))
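
Moving COUNT inside the subquery addresses the parse error, but two things may still get in the way: depending on the Hive version, a subquery in the SELECT list may not be accepted at all, and the subquery as written is not tied to the outer row's country, so it would produce the same number for every country. A rough alternative, offered as an untested sketch rather than a definitive query, is to join each new user's first login to that user's logins and count conditionally. It assumes that utc_timestamp holds Unix epoch seconds, that month is an integer partition column, and that "the day after" means 24 to 48 hours after the first login. The aliases f, r, first_login, and login_ts are made up for illustration.

```sql
-- Sketch only. Assumptions: utc_timestamp is Unix epoch seconds, month is an
-- integer partition column, and "the day after" means 24 to 48 hours later.
SELECT
    f.country AS country,
    COUNT(DISTINCT f.id) AS Users,
    -- Count a user as retained if any login falls in the 24h-48h window;
    -- the CASE yields NULL otherwise, and COUNT ignores NULLs.
    COUNT(DISTINCT CASE
        WHEN r.login_ts >= f.first_login + 86400
         AND r.login_ts <  f.first_login + 172800
        THEN f.id
    END) AS Retained1
FROM (
    -- One row per new user: first session (Session_nb = 0) after the cutoff
    SELECT
        id,
        get_json_object(json, '$.User_Country') AS country,
        MIN(utc_timestamp) AS first_login
    FROM events
    WHERE month = 201408
      AND name = 'Logged_in'
      AND get_json_object(json, '$.Session_nb') = 0
      AND utc_timestamp > UNIX_TIMESTAMP('2014-08-15 12:00:00')
    GROUP BY id, get_json_object(json, '$.User_Country')
) f
LEFT OUTER JOIN (
    -- All logins from August onward, so a return visit in early September
    -- is still seen for users who first logged in on August 31
    SELECT id, utc_timestamp AS login_ts
    FROM events
    WHERE month >= 201408
      AND name = 'Logged_in'
) r
ON f.id = r.id
GROUP BY f.country
ORDER BY country
```

The join condition is kept to a plain equality on id because older Hive versions restrict joins to equality conditions. The time window lives in the CASE expression rather than in a WHERE clause, so users who never came back still count toward Users.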