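I have a schema with images, and I also have results for those images. The results live in N tables with different schemas. I need to write a search query that returns all images matching some criteria (including limit and offset), together with their results.
An image might have 10 results (2 classifications, 8 detections). I want the limit to apply to images, not to results. So for 1 image I want to be able to get 10 rows back.
This is what I have so far. Its problem is that result rows get duplicated and combined, i.e. I want one row per result, not detections and classifications paired up like that. Do I need UNION ALL?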
CREATE TABLE images (
id VARCHAR(40) NOT NULL,
PRIMARY KEY (id)
);
CREATE TABLE image_results_classification (
image_id VARCHAR(40) NOT NULL,
c_confidence REAL NOT NULL,
FOREIGN KEY (image_id) REFERENCES images(id)
);
CREATE TABLE image_results_detection (
image_id VARCHAR(40) NOT NULL,
d_confidence REAL NOT NULL,
FOREIGN KEY (image_id) REFERENCES images(id)
);
INSERT INTO images (id) VALUES ('123');
INSERT INTO images (id) VALUES ('456');
INSERT INTO image_results_classification (image_id, c_confidence) VALUES ('123', 0.9);
INSERT INTO image_results_classification (image_id, c_confidence) VALUES ('123', 0.8);
INSERT INTO image_results_classification (image_id, c_confidence) VALUES ('456', 0.7);
INSERT INTO image_results_detection (image_id, d_confidence) VALUES ('123', 0.1);
INSERT INTO image_results_detection (image_id, d_confidence) VALUES ('123', 0.2);
INSERT INTO image_results_detection (image_id, d_confidence) VALUES ('456', 0.3);
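This schema is contrived for this question to keep things simple: the two real result tables hold more than this, and they differ from each other in more than the confidence column.
What I want to end up with in my application layer is the type: Map[Image, (List[ClassificationResult], List[DetectionResult])]
I.e. the images, along with all of their results. A result set with NULLs would be fine. Maybe something like this?: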
id c_confidence d_confidence
123 0.9 NULL
123 0.8 NULL
123 NULL 0.1
123 NULL 0.2
456 0.7 NULL
456 NULL 0.3
Here is the query from the DB Fiddle:
SELECT *
FROM images INNER JOIN
(SELECT id FROM images LIMIT 10 OFFSET 0
) AS i
ON (images.id = i.id) LEFT OUTER JOIN
image_results_classification c
ON (images.id = c.image_id) LEFT OUTER JOIN
image_results_detection d
ON (images.id = d.image_id);
https://www.db-fiddle.com/f/tuDxwY7kQGfEvZSzaajESG/0
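Edit: I need to filter on the results and also be able to limit and offset the images; that is a secondary requirement.
I want to be able to run a query like the following:
Give me all images and all of their results, for images that have c_confidence > 0.5. I.e. if an image has a c_confidence of 0.4, the image should be included (without any results). If it has a c_confidence of 0.6, then return all of its results (including those from image_results_detection).
I have updated my fiddle to reflect this: https://www.db-fiddle.com/f/tuDxwY7kQGfEvZSzaajESG/1
In the fiddle, I want no results to come back, because the image has no image_results_classification with confidence > 0.8.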
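Answer 0 (score: 2)
You can use GROUP_CONCAT together with GROUP BY. The first group_concat can be done in a subquery that carries the LIMIT, which avoids the Cartesian join effect between the two one-to-many relationships.
For example: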
SELECT
q.*,
group_concat(d.d_confidence) as d_confidence_list
FROM
(
SELECT i.id, group_concat(c.c_confidence) as c_confidence_list
FROM images i
LEFT JOIN image_results_classification c ON (c.image_id = i.id)
GROUP BY i.id
LIMIT 10
) q
LEFT JOIN image_results_detection d ON (d.image_id = q.id)
GROUP BY q.id, q.c_confidence_list
Or you can use DISTINCT on the values and do it without a subquery:
SELECT
i.id,
group_concat(distinct c.c_confidence) as c_confidence_list,
group_concat(distinct d.d_confidence) as d_confidence_list
FROM images i
LEFT JOIN image_results_classification c ON (c.image_id = i.id)
LEFT JOIN image_results_detection d ON (d.image_id = i.id)
GROUP BY i.id
LIMIT 10
But if the joined tables hold many confidence rows per image, the first approach will probably be faster.
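To make the difference between these variants concrete, here is a minimal sketch that runs them against the sample data with Python's built-in sqlite3 module. SQLite stands in for whatever engine the fiddle uses, and the second detection row for image 456 (repeating the value 0.3) is an assumption added only to show how DISTINCT behaves with repeated values.

```python
import sqlite3

# Sample schema from the question, plus one extra detection row for image 456
# that repeats the value 0.3 (an assumption added for this illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE images (id VARCHAR(40) NOT NULL, PRIMARY KEY (id));
CREATE TABLE image_results_classification (image_id VARCHAR(40) NOT NULL, c_confidence REAL NOT NULL);
CREATE TABLE image_results_detection (image_id VARCHAR(40) NOT NULL, d_confidence REAL NOT NULL);
INSERT INTO images VALUES ('123'), ('456');
INSERT INTO image_results_classification VALUES ('123', 0.9), ('123', 0.8), ('456', 0.7);
INSERT INTO image_results_detection VALUES ('123', 0.1), ('123', 0.2), ('456', 0.3), ('456', 0.3);
""")

def parse(csv):
    """Turn a group_concat string into a sorted list of floats (NULL -> [])."""
    return sorted(float(x) for x in csv.split(",")) if csv else []

def run(sql):
    """Run a query returning (id, c_list, d_list) and index the parsed lists by image id."""
    return {img: (parse(c), parse(d)) for img, c, d in conn.execute(sql)}

# Both one-to-many joins at once, no DISTINCT: every classification is paired
# with every detection, so each value is repeated.
naive = run("""
SELECT i.id, group_concat(c.c_confidence), group_concat(d.d_confidence)
FROM images i
LEFT JOIN image_results_classification c ON c.image_id = i.id
LEFT JOIN image_results_detection d ON d.image_id = i.id
GROUP BY i.id
""")

# Subquery variant: aggregate classifications first, then join detections.
nested = run("""
SELECT q.*, group_concat(d.d_confidence) AS d_confidence_list
FROM (
    SELECT i.id, group_concat(c.c_confidence) AS c_confidence_list
    FROM images i
    LEFT JOIN image_results_classification c ON c.image_id = i.id
    GROUP BY i.id
    LIMIT 10
) q
LEFT JOIN image_results_detection d ON d.image_id = q.id
GROUP BY q.id, q.c_confidence_list
""")

# DISTINCT variant: removes the join duplicates, but also genuine repeats.
distinct = run("""
SELECT i.id,
       group_concat(DISTINCT c.c_confidence),
       group_concat(DISTINCT d.d_confidence)
FROM images i
LEFT JOIN image_results_classification c ON c.image_id = i.id
LEFT JOIN image_results_detection d ON d.image_id = i.id
GROUP BY i.id
LIMIT 10
""")

for name, result in (("naive", naive), ("nested", nested), ("distinct", distinct)):
    print(name, result)
```

In this sketch the subquery variant keeps both 0.3 detections for image 456, while the DISTINCT variant collapses them into one, so DISTINCT is only safe when equal values never need to be counted separately.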
Additional
Here are 2 more queries to try.
The first one should give the expected result. By using a CTE, the LIMIT only has to be applied once.
with TOPIMG as (
select * from images LIMIT 10
)
select image_id, c_confidence, null as d_confidence
from TOPIMG i
join image_results_classification c on c.image_id = i.id
union all
select image_id, null as c_confidence, d_confidence
from TOPIMG i
join image_results_detection d on d.image_id = i.id
order by image_id;
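As a quick check that the LIMIT in this CTE applies to images rather than to result rows, the following sketch runs the same query shape through Python's sqlite3 module. The function name results_for_images, the ORDER BY id inside the CTE (added so paging is deterministic), and the use of SQLite are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE images (id VARCHAR(40) NOT NULL, PRIMARY KEY (id));
CREATE TABLE image_results_classification (image_id VARCHAR(40) NOT NULL, c_confidence REAL NOT NULL);
CREATE TABLE image_results_detection (image_id VARCHAR(40) NOT NULL, d_confidence REAL NOT NULL);
INSERT INTO images VALUES ('123'), ('456');
INSERT INTO image_results_classification VALUES ('123', 0.9), ('123', 0.8), ('456', 0.7);
INSERT INTO image_results_detection VALUES ('123', 0.1), ('123', 0.2), ('456', 0.3);
""")

def results_for_images(conn, limit, offset=0):
    """Return one row per result for a page of images (hypothetical helper).

    limit/offset select images; every result of a selected image is returned.
    """
    sql = """
    WITH topimg AS (
        SELECT id FROM images ORDER BY id LIMIT ? OFFSET ?
    )
    SELECT c.image_id, c.c_confidence, NULL AS d_confidence
    FROM topimg i
    JOIN image_results_classification c ON c.image_id = i.id
    UNION ALL
    SELECT d.image_id, NULL, d.d_confidence
    FROM topimg i
    JOIN image_results_detection d ON d.image_id = i.id
    ORDER BY image_id
    """
    return conn.execute(sql, (limit, offset)).fetchall()

first_page = results_for_images(conn, limit=1, offset=0)
second_page = results_for_images(conn, limit=1, offset=1)
print(len(first_page), len(second_page))
```

With the sample data, a page of one image yields four rows for image 123 and two rows for image 456. Because the joins are inner joins, an image that has no results at all produces no rows here.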
This query uses a trick to mimic, in a roundabout way, a ROW_NUMBER function with PARTITION. (I don't like it; it kills performance.)
with TOPIMG as (
select * from images LIMIT 10
)
select
image_id,
max(case when src = 'c' then conf end) as c_conf,
max(case when src = 'd' then conf end) as d_conf
from
(
select image_id, 'c' as src, c_confidence as conf,
(
select count(*)
from image_results_classification c2
where c.image_id = c2.image_id and c.c_confidence >= c2.c_confidence
) as RN
from TOPIMG i
join image_results_classification c on (c.image_id = i.id)
union all
select image_id, 'd', d_confidence,
(
select count(*)
from image_results_detection d2
where d.image_id = d2.image_id and d.d_confidence >= d2.d_confidence
) as RN
from TOPIMG i
join image_results_detection d on (d.image_id = i.id)
) cd
group by image_id, RN
order by image_id, RN;
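Where the database does support window functions, the correlated COUNT(*) subqueries can be swapped for a real ROW_NUMBER() OVER (PARTITION BY ...). The sketch below is one possible version, run through Python's sqlite3 module; it assumes the bundled SQLite is 3.25 or newer, and the ORDER BY id in the CTE is an addition for determinism.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE images (id VARCHAR(40) NOT NULL, PRIMARY KEY (id));
CREATE TABLE image_results_classification (image_id VARCHAR(40) NOT NULL, c_confidence REAL NOT NULL);
CREATE TABLE image_results_detection (image_id VARCHAR(40) NOT NULL, d_confidence REAL NOT NULL);
INSERT INTO images VALUES ('123'), ('456');
INSERT INTO image_results_classification VALUES ('123', 0.9), ('123', 0.8), ('456', 0.7);
INSERT INTO image_results_detection VALUES ('123', 0.1), ('123', 0.2), ('456', 0.3);
""")

# Number the results of each kind per image, then pair classification n with
# detection n by grouping on (image_id, rn).
PAIRED_SQL = """
WITH topimg AS (
    SELECT id FROM images ORDER BY id LIMIT 10
),
numbered AS (
    SELECT c.image_id, 'c' AS src, c.c_confidence AS conf,
           ROW_NUMBER() OVER (PARTITION BY c.image_id ORDER BY c.c_confidence) AS rn
    FROM topimg i
    JOIN image_results_classification c ON c.image_id = i.id
    UNION ALL
    SELECT d.image_id, 'd', d.d_confidence,
           ROW_NUMBER() OVER (PARTITION BY d.image_id ORDER BY d.d_confidence)
    FROM topimg i
    JOIN image_results_detection d ON d.image_id = i.id
)
SELECT image_id,
       MAX(CASE WHEN src = 'c' THEN conf END) AS c_conf,
       MAX(CASE WHEN src = 'd' THEN conf END) AS d_conf
FROM numbered
GROUP BY image_id, rn
ORDER BY image_id, rn
"""

paired = conn.execute(PAIRED_SQL).fetchall()
for row in paired:
    print(row)
```

One behavioral difference worth knowing: ROW_NUMBER gives tied confidences distinct numbers, whereas the COUNT(*) trick gives ties the same number, so tied rows of one kind would be merged by the GROUP BY.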
Update
Implementing the special-sauce requirement c_confidence > 0.5:
with IMG as (
select i.id as image_id,
max(case when c.image_id is not null then 1 else 0 end) as show_all
from images i
left join image_results_classification c on (c.image_id = i.id and c.c_confidence > 0.5)
group by i.id
order by i.id
LIMIT 100
)
select c.image_id, 'c' as result_type, c.c_confidence as confidence
from IMG i
join image_results_classification c on c.image_id = i.image_id
where i.show_all = 1
union all
select d.image_id, 'd' as result_type, d.d_confidence as confidence
from IMG i
join image_results_detection d on d.image_id = i.image_id
where i.show_all = 1
union all
select i.image_id, null, null
from IMG i
where i.show_all = 0
order by image_id;
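The same query can be parameterized so that the threshold and the paging of images come from the application. This is a minimal sketch using Python's sqlite3 module; the function name search and the parameter names min_conf, max_images, and skip_images are hypothetical, and SQLite stands in for the real engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE images (id VARCHAR(40) NOT NULL, PRIMARY KEY (id));
CREATE TABLE image_results_classification (image_id VARCHAR(40) NOT NULL, c_confidence REAL NOT NULL);
CREATE TABLE image_results_detection (image_id VARCHAR(40) NOT NULL, d_confidence REAL NOT NULL);
INSERT INTO images VALUES ('123'), ('456');
INSERT INTO image_results_classification VALUES ('123', 0.9), ('123', 0.8), ('456', 0.7);
INSERT INTO image_results_detection VALUES ('123', 0.1), ('123', 0.2), ('456', 0.3);
""")

SEARCH_SQL = """
WITH img AS (
    SELECT i.id AS image_id,
           MAX(CASE WHEN c.image_id IS NOT NULL THEN 1 ELSE 0 END) AS show_all
    FROM images i
    LEFT JOIN image_results_classification c
           ON c.image_id = i.id AND c.c_confidence > :min_conf
    GROUP BY i.id
    ORDER BY i.id
    LIMIT :max_images OFFSET :skip_images
)
SELECT c.image_id, 'c' AS result_type, c.c_confidence AS confidence
FROM img i
JOIN image_results_classification c ON c.image_id = i.image_id
WHERE i.show_all = 1
UNION ALL
SELECT d.image_id, 'd', d.d_confidence
FROM img i
JOIN image_results_detection d ON d.image_id = i.image_id
WHERE i.show_all = 1
UNION ALL
SELECT i.image_id, NULL, NULL
FROM img i
WHERE i.show_all = 0
ORDER BY image_id
"""

def search(conn, min_conf, max_images=100, skip_images=0):
    """Return (image_id, result_type, confidence) rows (hypothetical helper).

    Images with a classification above min_conf get all of their results;
    the other images on the page get a single (image_id, NULL, NULL) row.
    """
    params = {"min_conf": min_conf, "max_images": max_images, "skip_images": skip_images}
    return conn.execute(SEARCH_SQL, params).fetchall()

rows_085 = search(conn, 0.85)  # only image 123 has a classification above 0.85
rows_050 = search(conn, 0.5)   # both images qualify
rows_095 = search(conn, 0.95)  # neither image qualifies
print(rows_085)
```

With the sample data and a threshold of 0.85, image 123 comes back with all four of its results and image 456 with a single placeholder row. If images below the threshold should be dropped entirely, the last UNION ALL branch can be removed.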
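Answer 1 (score: 2)
You are combining every classification with every detection. But the two are not really related, so don't do that. One solution is to select classifications and detections separately and UNION ALL them.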
select *
from
(
select 'Classification' as what, image_id, c_confidence as value
from image_results_classification
union all
select 'Detection' as what, image_id, d_confidence as value
from image_results_detection
) results
where image_id in
(
select id
from images
-- order by something to decide which images to pick?
limit 10
);
Output:
+----------------+----------+-------+
| what           | image_id | value |
+----------------+----------+-------+
| Classification | 123      | 0.8   |
| Classification | 123      | 0.9   |
| Detection      | 123      | 0.1   |
| Detection      | 123      | 0.2   |
| Classification | 456      | 0.7   |
| Detection      | 456      | 0.3   |
+----------------+----------+-------+
DB-fiddle demo: https://www.db-fiddle.com/f/fZPMNL7NC8GzwkwHc4strG/0
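On the application side, the rows from this UNION ALL query can be folded into the per-image structure the question asks for. The sketch below approximates Map[Image, (List[ClassificationResult], List[DetectionResult])] with a plain Python dict of two lists; the function name load_results, the ORDER BY id in the image subquery, and the use of SQLite via sqlite3 are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE images (id VARCHAR(40) NOT NULL, PRIMARY KEY (id));
CREATE TABLE image_results_classification (image_id VARCHAR(40) NOT NULL, c_confidence REAL NOT NULL);
CREATE TABLE image_results_detection (image_id VARCHAR(40) NOT NULL, d_confidence REAL NOT NULL);
INSERT INTO images VALUES ('123'), ('456');
INSERT INTO image_results_classification VALUES ('123', 0.9), ('123', 0.8), ('456', 0.7);
INSERT INTO image_results_detection VALUES ('123', 0.1), ('123', 0.2), ('456', 0.3);
""")

RESULTS_SQL = """
SELECT what, image_id, value
FROM (
    SELECT 'Classification' AS what, image_id, c_confidence AS value
    FROM image_results_classification
    UNION ALL
    SELECT 'Detection' AS what, image_id, d_confidence AS value
    FROM image_results_detection
) results
WHERE image_id IN (
    SELECT id FROM images ORDER BY id LIMIT ? OFFSET ?
)
"""

def load_results(conn, limit=10, offset=0):
    """Group result rows by image: {image_id: (classifications, detections)}."""
    grouped = {}
    for what, image_id, value in conn.execute(RESULTS_SQL, (limit, offset)):
        classifications, detections = grouped.setdefault(image_id, ([], []))
        # Route each row to the list matching its source table.
        if what == "Classification":
            classifications.append(value)
        else:
            detections.append(value)
    return grouped

grouped = load_results(conn)
print(grouped)
```

Images without any results do not appear in this dict, because the query only returns result rows. Seeding the dict from the selected image ids first would be one way to include them.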