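I'm having trouble with a complex query on a SQLite3 database. I think it comes from a misunderstanding of how to reference columns in the result table returned by a SELECT statement, particularly when aliases are involved.
Here is a sample table: a list of movie IDs, with one row for each actor who worked on the movie: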
CREATE TABLE movie_actor (imdb_id TEXT, actor TEXT);
INSERT INTO movie_actor VALUES('44r4', 'John Doe');
INSERT INTO movie_actor VALUES('44r4', 'Jane Doe');
INSERT INTO movie_actor VALUES('44r4', 'Jermaine Doe');
INSERT INTO movie_actor VALUES('44r4', 'Jacob Doe');
INSERT INTO movie_actor VALUES('55r5', 'John Doe');
INSERT INTO movie_actor VALUES('55r5', 'Jane Doe');
INSERT INTO movie_actor VALUES('55r5', 'Nathan Deer');
INSERT INTO movie_actor VALUES('66r6', 'Bob Duck');
INSERT INTO movie_actor VALUES('66r6', 'John Doe');
INSERT INTO movie_actor VALUES('66r6', 'Jermaine Doe');
INSERT INTO movie_actor VALUES('66r6', 'Jane Doe');
INSERT INTO movie_actor VALUES('77r7', 'John Doe');
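I'm trying to find out how many times each pair of actors has worked together across all movies. I decided to solve this with a self-join, but ran into a problem: I get mirrored pairs of records such as "John Doe, Jane Doe, 3" and "Jane Doe, John Doe, 3". These are really the same thing, and I only want to count the first version. This is the code I came up with: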
SELECT DISTINCT
CASE WHEN d.actor_1 > d.actor_2 THEN d.actor_1 ELSE d.actor_2 END d.actor_1,
CASE WHEN d.actor_2 > d.actor_1 THEN d.actor_2 ELSE d.actor_1 END d.actor_2,
d.v
FROM (
SELECT c.actor_1 AS actor_1, c.actor_2 AS actor_2, COUNT(*) AS v
FROM (
SELECT a.actor AS actor_1, b.actor AS actor_2
FROM movie_actor a JOIN movie_actor b ON a.imdb_id=b.imdb_id
) AS c
WHERE c.actor_1 <> c.actor_2
GROUP BY c.actor_1, c.actor_2
HAVING COUNT(*) > 2
ORDER BY COUNT(*) DESC
LIMIT 20
)
AS d
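This does not run, but I can't figure out why. My assumption is that I'm not using the aliases correctly, but I really don't know. Any ideas?
Answer 0 (score: 3)
If we add the condition a.actor < b.actor, we get a simpler query. The condition excludes pairs of an actor with themselves and also removes the need to swap the actors.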
SELECT
a.actor AS actor_1, b.actor AS actor_2, COUNT(*) AS v
FROM
movie_actor a
INNER JOIN movie_actor b
ON a.imdb_id = b.imdb_id
WHERE
a.actor < b.actor
GROUP BY a.actor, b.actor
ORDER BY COUNT(*) DESC, a.actor, b.actor
LIMIT 20
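To try this query against the sample data from the question, here is a minimal sketch using Python's standard `sqlite3` module with an in-memory database. The helper name `top_pairs` and the `ROWS` list are made up for illustration; the SQL is the query above, with the limit passed as a parameter.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("44r4", "John Doe"), ("44r4", "Jane Doe"),
    ("44r4", "Jermaine Doe"), ("44r4", "Jacob Doe"),
    ("55r5", "John Doe"), ("55r5", "Jane Doe"), ("55r5", "Nathan Deer"),
    ("66r6", "Bob Duck"), ("66r6", "John Doe"),
    ("66r6", "Jermaine Doe"), ("66r6", "Jane Doe"),
    ("77r7", "John Doe"),
]


def top_pairs(rows, limit=20):
    """Return (actor_1, actor_2, count) tuples, most frequent pairs first."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE movie_actor (imdb_id TEXT, actor TEXT)")
    conn.executemany("INSERT INTO movie_actor VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT a.actor AS actor_1, b.actor AS actor_2, COUNT(*) AS v
        FROM movie_actor a
        INNER JOIN movie_actor b ON a.imdb_id = b.imdb_id
        WHERE a.actor < b.actor      -- keep each pair once, in sorted order
        GROUP BY a.actor, b.actor
        ORDER BY COUNT(*) DESC, a.actor, b.actor
        LIMIT ?
        """,
        (limit,),
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for actor_1, actor_2, v in top_pairs(ROWS):
        print(actor_1, actor_2, v)
```

With the sample data, the first row should be the pair Jane Doe / John Doe with a count of 3, and no mirrored duplicate of it appears.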
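Note: a SQL join always builds a cross product, i.e. it creates every possible combination of records that matches the join condition. So for imdb_id 55r5 (which has 3 actors), it first generates the following 3 x 3 = 9 pairs: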
John Doe John Doe
John Doe Jane Doe
John Doe Nathan Deer
Jane Doe John Doe
Jane Doe Jane Doe
Jane Doe Nathan Deer
Nathan Deer John Doe
Nathan Deer Jane Doe
Nathan Deer Nathan Deer
The WHERE clause then excludes all pairs where a >= b, and we are left with
John Doe Nathan Deer
Jane Doe John Doe
Jane Doe Nathan Deer
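The two steps above can be reproduced with a small sketch, again using Python's `sqlite3` module. It loads only the three rows for 55r5 and runs the self-join with and without the a.actor < b.actor filter. The function name `pairs_for_movie` and its `ordered_only` flag are hypothetical, and the added ORDER BY is there only to make the output deterministic.

```python
import sqlite3

# Only the three rows for movie 55r5 from the question.
ROWS_55R5 = [
    ("55r5", "John Doe"),
    ("55r5", "Jane Doe"),
    ("55r5", "Nathan Deer"),
]


def pairs_for_movie(rows, imdb_id, ordered_only):
    """Self-join one movie's cast; optionally keep only pairs with a < b."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE movie_actor (imdb_id TEXT, actor TEXT)")
    conn.executemany("INSERT INTO movie_actor VALUES (?, ?)", rows)
    sql = (
        "SELECT a.actor, b.actor "
        "FROM movie_actor a JOIN movie_actor b ON a.imdb_id = b.imdb_id "
        "WHERE a.imdb_id = ?"
    )
    if ordered_only:
        # Drops self-pairs and the mirrored half of every pair.
        sql += " AND a.actor < b.actor"
    sql += " ORDER BY a.actor, b.actor"
    result = conn.execute(sql, (imdb_id,)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(len(pairs_for_movie(ROWS_55R5, "55r5", ordered_only=False)))
    for pair in pairs_for_movie(ROWS_55R5, "55r5", ordered_only=True):
        print(pair)
```

The unfiltered join yields 9 rows, and the filtered one keeps the 3 pairs listed above. The comparison is a plain string comparison, so "Jane Doe" sorts before "John Doe".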
Answer 1 (score: 1)
Generate the distinct pairs first, then count them.
select actor_1, actor_2, count(*)
from (select distinct a.imdb_id, a.actor as actor_1, b.actor as actor_2
from movie_actor a
inner join movie_actor b on a.imdb_id = b.imdb_id
where a.actor < b.actor) x
group by actor_1, actor_2
order by actor_1, actor_2;
actor_1     actor_2     count(*)
----------  ----------  ----------
Bob Duck    Jane Doe    1
Bob Duck    Jermaine D  1
Bob Duck    John Doe    1
Jacob Doe   Jane Doe    1
Jacob Doe   Jermaine D  1
Jacob Doe   John Doe    1
Jane Doe    Jermaine D  2
Jane Doe    John Doe    3
Jane Doe    Nathan Dee  1
Jermaine D  John Doe    2
John Doe    Nathan Dee  1
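One situation where the DISTINCT step changes the result is when the table contains the same (imdb_id, actor) row more than once. The sample data has no such duplicates, so the sketch below adds one hypothetical duplicate row for illustration and compares the counts with and without DISTINCT. The helper name `pair_counts` and the `use_distinct` flag are assumptions, not part of the answer.

```python
import sqlite3

# Sample rows copied from the question.
BASE_ROWS = [
    ("44r4", "John Doe"), ("44r4", "Jane Doe"),
    ("44r4", "Jermaine Doe"), ("44r4", "Jacob Doe"),
    ("55r5", "John Doe"), ("55r5", "Jane Doe"), ("55r5", "Nathan Deer"),
    ("66r6", "Bob Duck"), ("66r6", "John Doe"),
    ("66r6", "Jermaine Doe"), ("66r6", "Jane Doe"),
    ("77r7", "John Doe"),
]

# Hypothetical duplicate row, NOT in the original data.
ROWS_WITH_DUPLICATE = BASE_ROWS + [("55r5", "John Doe")]


def pair_counts(rows, use_distinct):
    """Return {(actor_1, actor_2): count}, with or without the DISTINCT step."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE movie_actor (imdb_id TEXT, actor TEXT)")
    conn.executemany("INSERT INTO movie_actor VALUES (?, ?)", rows)
    distinct = "DISTINCT " if use_distinct else ""
    sql = f"""
        SELECT actor_1, actor_2, COUNT(*)
        FROM (SELECT {distinct}a.imdb_id, a.actor AS actor_1, b.actor AS actor_2
              FROM movie_actor a
              INNER JOIN movie_actor b ON a.imdb_id = b.imdb_id
              WHERE a.actor < b.actor) x
        GROUP BY actor_1, actor_2
        ORDER BY actor_1, actor_2
    """
    result = {(a, b): n for a, b, n in conn.execute(sql)}
    conn.close()
    return result


if __name__ == "__main__":
    pair = ("Jane Doe", "John Doe")
    print(pair_counts(ROWS_WITH_DUPLICATE, use_distinct=False)[pair])
    print(pair_counts(ROWS_WITH_DUPLICATE, use_distinct=True)[pair])
```

With the duplicate row present, the plain join counts Jane Doe / John Doe four times, while the DISTINCT version still reports 3 shared movies. On the original data both variants give the same counts.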