I have some tables like this:
I am trying to INNER JOIN these two tables so that I get something like the following:
time | block_height | differential_pressure |
---------------------+--------------+-----------------------+
2018-09-08 11:14:10 | 83.7 | 286.84 |
2018-09-08 11:14:10 | 83.6 | 282.14 |
2018-09-08 11:14:11 | 83.4 | 298.35 |
2018-09-08 11:14:12 | 83.1 | 298.23 |
2018-09-08 11:14:12 | 82.9 | 294.76 |
2018-09-08 11:14:13 | 82.7 | 288.37 |
But when I run the following query:
SELECT * FROM rt_block_height
INNER JOIN rt_differential_pressure
ON rt_block_height.time = rt_differential_pressure.time;
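This is what I get:
I don't understand what is happening here. Some seemingly random extra rows have been added, and I don't know why. Each of the original tables has only 6 rows, but the query returns 10.
I don't know whether this is relevant, but these are TimescaleDB hypertables. Here is the source code used to create the table: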
CREATE TABLE IF NOT EXISTS public.rt_BLOCK_HEIGHT
(
"time" timestamp without time zone,
BLOCK_HEIGHT double precision
)
WITH (
OIDS = FALSE
)
TABLESPACE pg_default;
ALTER TABLE public.rt_BLOCK_HEIGHT
OWNER to postgres;
SELECT create_hypertable('rt_BLOCK_HEIGHT', 'time');
Answer 0 (score: 3)
Your time column is not unique.
For the timestamp 2018-09-08 11:14:10 you have:
block_heightA = 83.7
block_heightB = 83.6
differential_pressureA = 286.84
differential_pressureB = 282.14
So when you join on time, you get the Cartesian product of the two 2-element sets for that timestamp:
2018-09-08 11:14:10 block_heightA differential_pressureA
2018-09-08 11:14:10 block_heightA differential_pressureB
2018-09-08 11:14:10 block_heightB differential_pressureA
2018-09-08 11:14:10 block_heightB differential_pressureB
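The same reasoning accounts for the row count in the question. As a rough check, the query below counts how many rows each timestamp has in each table and how many rows that timestamp contributes to the join. It is only a sketch: the aliases bh and dp and the column alias n are made up for illustration. Assuming both tables contain exactly the six timestamps shown in the desired output, the timestamps 11:14:10 and 11:14:12 each contribute 2 × 2 = 4 rows and the other two contribute 1 each, for 4 + 1 + 4 + 1 = 10 rows.

```sql
-- For each timestamp, count the rows in each table and multiply the counts:
-- that product is the number of rows the INNER JOIN emits for the timestamp.
SELECT
    bh."time",
    bh.n        AS block_height_rows,
    dp.n        AS differential_pressure_rows,
    bh.n * dp.n AS joined_rows
FROM (
    SELECT "time", count(*) AS n
    FROM rt_block_height
    GROUP BY "time"
) AS bh
INNER JOIN (
    SELECT "time", count(*) AS n
    FROM rt_differential_pressure
    GROUP BY "time"
) AS dp
ON bh."time" = dp."time"
ORDER BY bh."time";
```

Any timestamp where joined_rows is greater than 1 is one where the join multiplies rows.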
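To get the result you want, you need to decide how to handle the duplicate values for each timestamp. For example, you can compute the average: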
SELECT
grouped_block_height.time,
avg_block_height,
avg_differential_pressure
FROM (
SELECT time, avg(block_height) as avg_block_height
FROM rt_block_height
GROUP BY time
) as grouped_block_height
INNER JOIN (
SELECT time, avg(differential_pressure) as avg_differential_pressure
FROM rt_differential_pressure
GROUP BY time
) as grouped_differential_pressure
ON grouped_block_height.time = grouped_differential_pressure.time;
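Averaging collapses each timestamp to a single row. If you would rather keep one output row per original reading, another option is to number the rows within each timestamp and join on that number as well. The sketch below uses the standard row_number() window function; the aliases bh, dp, and rn are illustrative. Note the big assumption: the posted schema has no column that records the order in which readings arrived, so the ORDER BY inside each window (here, by value, descending) is a placeholder. Without a real ordering column, such as a higher-resolution timestamp or a sequence number, the pairing of readings within a timestamp is arbitrary.

```sql
-- Pair the k-th block_height reading with the k-th differential_pressure
-- reading of the same timestamp, instead of pairing every row with every row.
SELECT
    bh."time",
    bh.block_height,
    dp.differential_pressure
FROM (
    SELECT "time", block_height,
           -- ASSUMPTION: ordering by value is a stand-in for a real ordering column
           row_number() OVER (PARTITION BY "time" ORDER BY block_height DESC) AS rn
    FROM rt_block_height
) AS bh
INNER JOIN (
    SELECT "time", differential_pressure,
           -- ASSUMPTION: same placeholder ordering as above
           row_number() OVER (PARTITION BY "time" ORDER BY differential_pressure DESC) AS rn
    FROM rt_differential_pressure
) AS dp
ON bh."time" = dp."time" AND bh.rn = dp.rn
ORDER BY bh."time", bh.rn;
```

Because this is still an INNER JOIN, a timestamp with more readings in one table than in the other loses the unmatched rows; a FULL OUTER JOIN would keep them with NULLs on the missing side.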