I have three large tables, as follows:
property
--------
property_id
other_prop_data
transfer_property
-----------------
property_id
transfer_id
transfer
--------
transfer_id
contract_date
transfer_price
I want to return a list of unique property IDs for all transfers that occurred between '2012-01-01' and '2012-06-30'. This is the code I have so far:
SELECT *
FROM property p
JOIN
(
    SELECT t.transfer_id, t.contract_date, t.transfer_price::integer, tp.property_id
    FROM transfer t
    LEFT JOIN transfer_property tp ON tp.transfer_id = t.transfer_id
    WHERE t.contract_date BETWEEN '2012-01-01' AND '2012-06-30'
) transfer1 ON transfer1.property_id = p.property_id
AND NOT EXISTS
(
    SELECT transfer2.transfer_id
    FROM
    (
        SELECT t.transfer_id, t.contract_date, t.transfer_price::integer, tp.property_id
        FROM transfer t
        LEFT JOIN transfer_property tp ON tp.transfer_id = t.transfer_id
        WHERE t.contract_date BETWEEN '2012-01-01' AND '2012-06-30'
    ) AS transfer2
    WHERE transfer2.property_id = transfer1.property_id
    AND transfer2.contract_date > transfer1.contract_date
)
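This works (I think), but it is slow.
I found several similar queries under https://stackoverflow.com/questions/tagged/greatest-n-per-group, but most of the ones I found are self-joins on a single table rather than joins through a relationship table as above.
I know that in MySQL you can use user variables, but I don't know how to do that in PostgreSQL, or whether it would be the ideal solution in this case.
Does anyone have suggestions on how to improve this query (or even how to do this with a completely different approach)?
Any help is much appreciated. Thanks!
Regards,
Chris
PS: I also tried variations of DISTINCT and MAX, but I'm not convinced they select the record with the most recent date the way I was using them.
Edit: Sorry, I should also have added that I am running my queries in pgAdmin 1.12.3.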
Answer 0 (score: 1)
Try using ROW_NUMBER() OVER in PostgreSQL. Here is a SQLFiddle example:
SELECT *
FROM property p
JOIN
(
    SELECT t.transfer_id, t.contract_date,
           t.transfer_price::integer, tp.property_id,
           row_number() OVER
               (PARTITION BY tp.property_id
                ORDER BY t.contract_date DESC) AS rn
    FROM transfer t
    LEFT JOIN transfer_property tp
           ON tp.transfer_id = t.transfer_id
    WHERE t.contract_date BETWEEN '2012-01-01'
                              AND '2012-06-30'
) transfer1
ON transfer1.property_id = p.property_id
WHERE transfer1.rn = 1
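For comparison, the same "latest transfer per property" result can probably also be written with DISTINCT ON, which is a PostgreSQL extension rather than standard SQL. This is a minimal sketch against the table and column names from the question; the extra tie-breaker on transfer_id is an assumption, added only so that the chosen row is deterministic when two transfers share a contract_date.

```sql
-- Sketch only: DISTINCT ON is PostgreSQL-specific.
-- Keeps one row per property_id: the first row in ORDER BY order,
-- i.e. the transfer with the latest contract_date in the range.
SELECT DISTINCT ON (tp.property_id)
       p.*,
       t.transfer_id,
       t.contract_date,
       t.transfer_price
FROM   transfer t
JOIN   transfer_property tp ON tp.transfer_id = t.transfer_id
JOIN   property p           ON p.property_id  = tp.property_id
WHERE  t.contract_date BETWEEN '2012-01-01' AND '2012-06-30'
ORDER  BY tp.property_id,          -- must lead with the DISTINCT ON expression
          t.contract_date DESC,    -- latest transfer first
          t.transfer_id DESC;      -- assumed tie-breaker for equal dates
```

Whether this is faster than the ROW_NUMBER() version depends on the data and the available indexes, so it is worth comparing both with EXPLAIN ANALYZE.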
Answer 1 (score: 0)
Given the skeleton tables:
create table property( property_id serial primary key );
create table transfer(
transfer_id serial primary key,
contract_date date not null
);
create table transfer_property (
property_id integer references property(property_id),
transfer_id integer references transfer(transfer_id)
);
and the data:
insert into property
select nextval('property_property_id_seq')
from generate_series(1,10);
insert into transfer
select nextval('transfer_transfer_id_seq'),
DATE '2012-01-01' + x * INTERVAL '1 month'
from generate_series(1,10) x;
-- Repeat this 4 or 5 times to produce a pile of duplicate entries
insert into transfer_property (transfer_id,property_id)
select transfer_id, property_id
from property cross join transfer
order by random()
limit 40;
use:
select distinct property_id
from transfer_property tp inner join transfer t on (tp.transfer_id = t.transfer_id)
where t.contract_date between '2012-01-01' and '2012-06-30';
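Insufficient, or misunderstood? Then please post sample data and a real schema that shows meaningful relationships, along with the expected results.
Answer 2 (score: 0)
"I want to return a list of unique property IDs for all transfers that occurred between '2012-01-01' and '2012-06-30'."
To me, that reads as: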
SELECT DISTINCT tp.property_id
FROM transfer t
JOIN transfer_property tp ON tp.transfer_id = t.transfer_id
WHERE t.contract_date BETWEEN '2012-01-01' AND '2012-06-30'
;
Now put that in a CTE or a subquery, and you're done:
WITH x1 AS (
SELECT DISTINCT tp.property_id AS property_id
FROM transfer t
JOIN transfer_property tp ON tp.transfer_id = t.transfer_id
WHERE t.contract_date BETWEEN '2012-01-01' AND '2012-06-30'
)
SELECT ...
FROM property p
JOIN x1 ON x1.property_id = p.property_id
;
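I don't understand the purpose of the NOT EXISTS subquery. Are you only interested in the MAX?
Update: it appears (from the title) that you only want the max date. That can be done with your NOT EXISTS construct, or with MAX(...) in a subquery; something like: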
WITH m1 AS (
SELECT tp.property_id AS property_id -- GROUP BY already yields one row per property_id
, MAX(t.contract_date) AS contract_date
FROM transfer t
JOIN transfer_property tp ON tp.transfer_id = t.transfer_id
WHERE t.contract_date BETWEEN '2012-01-01' AND '2012-06-30'
GROUP BY tp.property_id
)
SELECT ...
FROM property p
JOIN m1 ON m1.property_id = p.property_id
;
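The m1 CTE carries only property_id and the latest contract_date. If the other columns of that latest transfer are needed too (transfer_id, transfer_price), one possible way is to join back through transfer_property on the date. This is a sketch under the question's schema, not a tested solution; note that if a property has two transfers on the same latest date, both rows come back.

```sql
-- Sketch only: fetch the full transfer row for each property's latest date.
WITH m1 AS (
    SELECT tp.property_id,
           MAX(t.contract_date) AS contract_date
    FROM   transfer t
    JOIN   transfer_property tp ON tp.transfer_id = t.transfer_id
    WHERE  t.contract_date BETWEEN '2012-01-01' AND '2012-06-30'
    GROUP  BY tp.property_id
)
SELECT p.property_id,
       t.transfer_id,
       t.contract_date,
       t.transfer_price
FROM   property p
JOIN   m1                   ON m1.property_id = p.property_id
JOIN   transfer_property tp ON tp.property_id = p.property_id
JOIN   transfer t           ON t.transfer_id  = tp.transfer_id
                           AND t.contract_date = m1.contract_date  -- keep only the latest
;
```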