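I'm struggling to come up with an efficient query that compares two tables with different attributes. This is a report for an online retailer with hundreds of thousands of SKUs for sale. Each SKU is a variation of a "parent" product. They sell on several marketplaces, so they need to see whether some items are not listed everywhere.
There is one table containing all parent products and another containing all variations with their corresponding SKUs. A third table holds the full list of every sku (variation) together with its marketplace; the sku + marketplace combination is unique.
The database is PostgreSQL.
The table structure is as follows:
Products table: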
Products
id | parent_sku | vendor_id
-------------------------------
1 | ABC | 100
2 | DEF | 200
3 | XYZ | 100
Variations table:
Variations
id | parent_id | sku
----------------------------
1 | 1 | ABC-1
2 | 1 | ABC-2
3 | 1 | ABC-3
4 | 2 | DEF-1
5 | 2 | DEF-2
6 | 3 | XYZ-1
7 | 3 | XYZ-2
Marketplace table:
MarketplaceData
id | sku | marketplace | price
----------------------------
1 | ABC-1 | website1 | 99.99
2 | ABC-2 | website1 | 99.99
3 | ABC-3 | website1 | 89.99
4 | DEF-1 | website1 | 29.99
5 | DEF-2 | website1 | 29.99
6 | XYZ-1 | website1 | 39.99
7 | XYZ-2 | website1 | 39.99
8 | ABC-1 | website2 | 99.99
9 | ABC-2 | website2 | 99.99
10 | ABC-3 | website2 | 99.99
11 | DEF-1 | website2 | 29.99
12 | DEF-2 | website2 | 29.99
13 | XYZ-1 | website2 | 34.99
14 | XYZ-2 | website2 | 34.99
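For anyone who wants to try the queries in this thread, the sample data above can be loaded with a script along these lines. This is only a sketch: the column types, the NOT NULL constraints, and the foreign keys are assumptions, since the question only shows column names and sample rows. The unique constraints follow what the question states about sku and sku + marketplace.

```sql
-- Assumed schema: types and foreign keys are guesses, not taken from the real database.
CREATE TABLE Products (
    id         integer PRIMARY KEY,
    parent_sku text    NOT NULL,
    vendor_id  integer NOT NULL
);

CREATE TABLE Variations (
    id        integer PRIMARY KEY,
    parent_id integer NOT NULL REFERENCES Products (id),
    sku       text    NOT NULL UNIQUE            -- unique index on sku, as described
);

CREATE TABLE MarketplaceData (
    id          integer PRIMARY KEY,
    sku         text           NOT NULL REFERENCES Variations (sku),
    marketplace text           NOT NULL,
    price       numeric(10, 2) NOT NULL,
    UNIQUE (sku, marketplace)                    -- sku + marketplace is unique
);

INSERT INTO Products (id, parent_sku, vendor_id) VALUES
    (1, 'ABC', 100), (2, 'DEF', 200), (3, 'XYZ', 100);

INSERT INTO Variations (id, parent_id, sku) VALUES
    (1, 1, 'ABC-1'), (2, 1, 'ABC-2'), (3, 1, 'ABC-3'),
    (4, 2, 'DEF-1'), (5, 2, 'DEF-2'),
    (6, 3, 'XYZ-1'), (7, 3, 'XYZ-2');

INSERT INTO MarketplaceData (id, sku, marketplace, price) VALUES
    (1,  'ABC-1', 'website1', 99.99), (2,  'ABC-2', 'website1', 99.99),
    (3,  'ABC-3', 'website1', 89.99), (4,  'DEF-1', 'website1', 29.99),
    (5,  'DEF-2', 'website1', 29.99), (6,  'XYZ-1', 'website1', 39.99),
    (7,  'XYZ-2', 'website1', 39.99), (8,  'ABC-1', 'website2', 99.99),
    (9,  'ABC-2', 'website2', 99.99), (10, 'ABC-3', 'website2', 99.99),
    (11, 'DEF-1', 'website2', 29.99), (12, 'DEF-2', 'website2', 29.99),
    (13, 'XYZ-1', 'website2', 34.99), (14, 'XYZ-2', 'website2', 34.99);
```

Note that the sample contains no website3 or website4 rows, so a query that excludes those marketplaces will return every parent (1, 2 and 3) on this data.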
I have a query that works, but it takes a very long time to run and is very expensive.
SELECT DISTINCT parent_id FROM Variations
WHERE sku IN (SELECT sku FROM MarketplaceData WHERE marketplace IN ('website1','website2'))
AND sku NOT IN (SELECT sku FROM MarketplaceData WHERE marketplace IN ('website3','website4'))
LIMIT 20 OFFSET 0
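Because each sku + marketplace dataset has nearly 400,000 rows and the MarketplaceData table holds more than 2 million rows, this query takes forever to execute.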
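To see where the time actually goes, it may help to look at the execution plan. The sketch below simply wraps the query from the question in EXPLAIN; keep in mind that the ANALYZE option really executes the statement, so it will take as long as the query itself.

```sql
-- Shows the chosen plan with actual timings and buffer usage.
-- ANALYZE runs the query for real, so expect the same long runtime.
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT parent_id FROM Variations
WHERE sku IN (SELECT sku FROM MarketplaceData WHERE marketplace IN ('website1','website2'))
  AND sku NOT IN (SELECT sku FROM MarketplaceData WHERE marketplace IN ('website3','website4'))
LIMIT 20 OFFSET 0;
```

In the output, check how the NOT IN part is evaluated. PostgreSQL generally cannot turn NOT IN (subquery) into an anti-join, so it appears as a subplan filter. If that subplan is not shown as hashed (the exact wording varies by version), it is likely being re-scanned for each row of Variations, which would explain the runtime; whether it can be hashed depends on the work_mem setting.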
As for indexes, the id column is the primary key of each table. The Variations table has an index on sku (which must be unique), and MarketplaceData has an index on sku + marketplace.
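One thing that might be worth testing: the existing MarketplaceData index leads with sku, while both subqueries filter on marketplace alone. An additional index that leads with marketplace could let PostgreSQL answer those subqueries from the index. This is a sketch, not a guaranteed fix; the index name is made up, and with each marketplace covering a large share of the table the planner may still prefer a sequential scan.

```sql
-- Hypothetical index name; column order puts the filtered column first
-- and includes sku so the subqueries may be served by an index-only scan.
CREATE INDEX marketplacedata_marketplace_sku_idx
    ON MarketplaceData (marketplace, sku);

-- Refresh planner statistics after creating the index.
ANALYZE MarketplaceData;
```

On a busy production table, CREATE INDEX CONCURRENTLY avoids blocking writes while the index is built, at the cost of a slower build.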
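Ultimately, what I need is a list of the unique parent_id values that match the criteria.
Any help or guidance would be greatly appreciated.
Thanks!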
Answer 0: (score: 1)
Instead of IN and NOT IN, you can use an INNER JOIN plus a LEFT JOIN with a check for NULL:
SELECT DISTINCT v.parent_id
FROM Variations v
INNER JOIN (
SELECT sku FROM MarketplaceData WHERE marketplace IN ('website1','website2')
) t1 ON t1.sku = v.sku
LEFT JOIN (
SELECT sku FROM MarketplaceData WHERE marketplace IN ('website3','website4')
) t2 ON t2.sku = v.sku
WHERE t2.sku IS NULL
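The same idea can also be written with EXISTS and NOT EXISTS, which PostgreSQL can plan as a semi-join and an anti-join. The following is a sketch of that variant, not a measured improvement. The aliases m and x are arbitrary, and the ORDER BY is an addition: without it, LIMIT/OFFSET paging is not guaranteed to return rows in a stable order.

```sql
SELECT DISTINCT v.parent_id
FROM Variations v
WHERE EXISTS (
        -- the sku is listed on at least one of the required marketplaces
        SELECT 1
        FROM MarketplaceData m
        WHERE m.sku = v.sku
          AND m.marketplace IN ('website1','website2')
      )
  AND NOT EXISTS (
        -- the sku is listed on none of the excluded marketplaces
        SELECT 1
        FROM MarketplaceData x
        WHERE x.sku = v.sku
          AND x.marketplace IN ('website3','website4')
      )
ORDER BY v.parent_id   -- added so that paging is deterministic
LIMIT 20 OFFSET 0;
```

Both correlated lookups filter on sku first, so they can use the existing sku + marketplace index. Unlike NOT IN, NOT EXISTS also behaves predictably if the subquery ever returns a NULL sku.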
Answer 1: (score: 0)
Why not use just a single subquery?
SELECT DISTINCT parent_id
FROM Variations
WHERE sku IN (
SELECT sku FROM MarketplaceData WHERE marketplace IN ('website1','website2')
EXCEPT
SELECT sku FROM MarketplaceData WHERE marketplace IN ('website3','website4')
)
LIMIT 20 OFFSET 0
Answer 2: (score: 0)
How about a simple aggregation to get the skus?
select mpd.sku
from MarketplaceData mpd
where mpd.marketplace in ('website1', 'website2', 'website3', 'website4')
group by mpd.sku
having count(*) filter (where mpd.marketplace in ('website1', 'website2')) > 0 and
count(*) filter (where mpd.marketplace in ('website3', 'website4')) = 0;
Then get the parent ids:
select distinct v.parent_id
from variations v join
(select mpd.sku
from MarketplaceData mpd
where mpd.marketplace in ('website1', 'website2', 'website3', 'website4')
group by mpd.sku
having count(*) filter (where mpd.marketplace in ('website1', 'website2')) > 0 and
count(*) filter (where mpd.marketplace in ('website3', 'website4')) = 0
) m
on m.sku = v.sku;
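The FILTER clause used here was introduced in PostgreSQL 9.4. If it is not available, or simply as a matter of taste, the same HAVING condition can be sketched with the bool_or aggregate instead. This variant is meant to be equivalent, assuming marketplace is never NULL:

```sql
SELECT DISTINCT v.parent_id
FROM Variations v
JOIN (
    SELECT mpd.sku
    FROM MarketplaceData mpd
    WHERE mpd.marketplace IN ('website1', 'website2', 'website3', 'website4')
    GROUP BY mpd.sku
    -- true if any row of the sku is on a required marketplace ...
    HAVING bool_or(mpd.marketplace IN ('website1', 'website2'))
       -- ... and no row of the sku is on an excluded marketplace
       AND NOT bool_or(mpd.marketplace IN ('website3', 'website4'))
) m ON m.sku = v.sku;
```

Either way, the aggregation reads MarketplaceData only once, instead of once per subquery.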