I have a simple table containing longitude, latitude, and time. Basically, I want a query result that gives me this:
lat,long,hourwindow,count
I can't seem to figure out how to do this. I've tried so many things that I can't keep them straight. Unfortunately, this is all I have so far:
WITH all_lat_long_by_time AS (
SELECT
trunc(cast(lat AS NUMERIC), 4) AS lat,
trunc(cast(long AS NUMERIC), 4) AS long,
date_trunc('hour', time :: TIMESTAMP WITHOUT TIME ZONE) AS hourWindow
FROM my_table
),
unique_lat_long_by_time AS (
SELECT DISTINCT * FROM all_lat_long_by_time
),
all_with_counts AS (
-- what do I do here?
)
SELECT * FROM all_with_counts;
Answer 0 (score: 1)
I think this is a fairly basic aggregation query:
SELECT date_trunc('hour', time :: TIMESTAMP WITHOUT TIME ZONE) AS hourWindow,
trunc(cast(lat AS NUMERIC), 4) AS lat,
trunc(cast(long AS NUMERIC), 4) AS long,
COUNT(*)
FROM my_table
GROUP BY hourWindow, trunc(cast(lat AS NUMERIC), 4), trunc(cast(long AS NUMERIC), 4)
ORDER BY hourWindow
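To see what this aggregation returns, here is a minimal self-contained sketch. The four sample rows are hypothetical and only stand in for my_table; the column names follow the question. After truncation to four decimals, the first two rows fall into the same hour and the same coordinates, so they should collapse into one group with a count of 2.

```sql
-- Hypothetical sample data standing in for my_table (not from the original post).
WITH my_table(lat, long, time) AS (
    VALUES
        (40.712345, -74.005999, '2020-01-01 10:05:00'),
        (40.712399, -74.005911, '2020-01-01 10:40:00'),  -- same coordinates after trunc(..., 4)
        (40.712345, -74.005999, '2020-01-01 11:15:00'),  -- same coordinates, next hour
        (51.507400,  -0.127800, '2020-01-01 11:20:00')
)
SELECT trunc(lat::numeric, 4)              AS lat,
       trunc(long::numeric, 4)             AS long,
       date_trunc('hour', time::timestamp) AS hourwindow,
       count(*)                            AS count
FROM my_table
GROUP BY 1, 2, 3          -- positional references to the three output columns
ORDER BY hourwindow, lat, long;
```

The output columns are ordered as lat, long, hourwindow, count to match the layout asked for in the question.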
Answer 1 (score: 0)
If "count of rows by uniqueness" means counting the distinct coordinates per hour (after truncating the numbers), then count(DISTINCT (lat,long)) does the job:
SELECT date_trunc('hour', time::timestamp) AS hour_window
, count(DISTINCT (trunc( lat::numeric, 4)
, trunc(long::numeric, 4))) AS count_distinct_coordinates
FROM tbl
GROUP BY 1
ORDER BY 1;
Details are in the PostgreSQL manual.
(lat,long) is a row value, short for ROW(lat,long). The manual's section on row constructors has more.
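A small sketch of the row-value shorthand, using made-up values and hypothetical column names a and b. The first query compares the two spellings of the same row value; the second counts distinct pairs, where the duplicated pair (1, 2) is counted once.

```sql
-- (1, 2) and ROW(1, 2) are two spellings of the same row value.
SELECT (1, 2) = ROW(1, 2) AS same_row_value;

-- count(DISTINCT (a, b)) treats each (a, b) pair as one value.
-- The inline VALUES list is illustrative data only.
SELECT count(DISTINCT (a, b)) AS distinct_pairs
FROM (VALUES (1, 2), (1, 2), (1, 3)) AS t(a, b);
```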
But count(DISTINCT ...) is typically slower; for your case, a subquery should be faster:
SELECT hour_window, count(*) AS count_distinct_coordinates
FROM (
SELECT date_trunc('hour', time::timestamp) AS hour_window
, trunc( lat::numeric, 4) AS lat
, trunc(long::numeric, 4) AS long
FROM tbl
GROUP BY 1, 2, 3
) sub
GROUP BY 1
ORDER BY 1;
Or:
SELECT hour_window, count(*) AS count_distinct_coordinates
FROM (
SELECT DISTINCT
date_trunc('hour', time::timestamp) AS hour_window
, trunc( lat::numeric, 4) AS lat
, trunc(long::numeric, 4) AS long
FROM tbl
) sub
GROUP BY 1
ORDER BY 1;
Once the subquery has folded the duplicates, the outer SELECT can use a plain count(*).
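Whether the subquery form is actually faster depends on the data, the indexes, and the PostgreSQL version, so it is worth measuring on your own table. A rough sketch, assuming the table and columns are named as in the queries above (tbl, lat, long, time); note that EXPLAIN (ANALYZE) really executes each statement:

```sql
-- Variant 1: count(DISTINCT ...) on a row value.
EXPLAIN (ANALYZE)
SELECT date_trunc('hour', time::timestamp) AS hour_window
     , count(DISTINCT (trunc( lat::numeric, 4)
                     , trunc(long::numeric, 4))) AS count_distinct_coordinates
FROM   tbl
GROUP  BY 1;

-- Variant 2: fold duplicates in a subquery, then use a plain count(*).
EXPLAIN (ANALYZE)
SELECT hour_window, count(*) AS count_distinct_coordinates
FROM  (
   SELECT DISTINCT
          date_trunc('hour', time::timestamp) AS hour_window
        , trunc( lat::numeric, 4) AS lat
        , trunc(long::numeric, 4) AS long
   FROM   tbl
   ) sub
GROUP  BY 1;
```

Compare the reported execution times over several runs rather than a single one, since caching can skew the first measurement.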