I have a SQL view named female that looks like this:
+----------+----------+--------+------+
| actor_id | movie_id | gender | year |
+----------+----------+--------+------+
| 528787 | 2 | M | 1996 |
| 528788 | 2 | F | 1952 |
| 528789 | 1 | M | 2001 |
| 528790 | 3 | M | 1994 |
| 528791 | 2 | F | 2000 |
| 528791 | 3 | F | 2004 |
| 528791 | 4 | F | 2000 |
| 528791 | 5 | F | 2001 |
| 528792 | 4 | F | 1999 |
| 528792 | 6 | F | 2000 |
+----------+----------+--------+------+
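...and so on.
Here actor_id and movie_id form a unique combination. I need to find every movie_id in which only female actors took part. This means I need to exclude all movies that have only male actors or both male and female actors.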
Expected output
+----------+----------+--------+------+
| actor_id | movie_id | gender | year |
+----------+----------+--------+------+
| 528791 | 4 | F | 2000 |
| 528791 | 5 | F | 2001 |
| 528792 | 4 | F | 1999 |
| 528792 | 6 | F | 2000 |
+----------+----------+--------+------+
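Please help me understand the solution and a possible query for this.
Sorry if this seems too obvious to some.
The answers given are not correct: I wrote Python code to cross-validate the values, and there I get a count of 18927.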
Answer 0 (score: 1)
NOT EXISTS comes to mind:
select f.*
from female f
where not exists (select 1
                  from female f2
                  where f2.movie_id = f.movie_id and f2.gender = 'M'
                 );
If you only want the movies, and not the original rows, then I would use aggregation.
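For the movies-only case, one possible aggregation query is sketched below. It is not the answerer's own query, only a minimal illustration: Python's built-in sqlite3 module serves as a harness, a plain table stands in for the female view, and the helper name female_only_movies is hypothetical. The rows are the sample rows from the question.

```python
import sqlite3

# Sample rows copied from the question: (actor_id, movie_id, gender, year).
ROWS = [
    (528787, 2, "M", 1996),
    (528788, 2, "F", 1952),
    (528789, 1, "M", 2001),
    (528790, 3, "M", 1994),
    (528791, 2, "F", 2000),
    (528791, 3, "F", 2004),
    (528791, 4, "F", 2000),
    (528791, 5, "F", 2001),
    (528792, 4, "F", 1999),
    (528792, 6, "F", 2000),
]


def female_only_movies(rows):
    """Return the movie_ids that have no row with gender 'M'."""
    conn = sqlite3.connect(":memory:")
    # Assumption: a plain table stands in for the `female` view.
    conn.execute(
        "CREATE TABLE female "
        "(actor_id INTEGER, movie_id INTEGER, gender TEXT, year INTEGER)"
    )
    conn.executemany("INSERT INTO female VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT movie_id
        FROM female
        GROUP BY movie_id
        HAVING SUM(CASE WHEN gender = 'M' THEN 1 ELSE 0 END) = 0
        ORDER BY movie_id
    """
    result = [movie_id for (movie_id,) in conn.execute(query)]
    conn.close()
    return result


print(female_only_movies(ROWS))  # [4, 5, 6]
```

On the sample data this yields movies 4, 5 and 6, the same movies that appear in the expected output.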
Answer 1 (score: 1)
You can try the following:
SELECT *
FROM [female]
WHERE movie_id IN (SELECT movie_id
FROM [female]
GROUP BY movie_id
HAVING Max(gender) = Min(gender)
AND Max(gender) = 'F')
If the view contains duplicates and you do not want them in the output, you can try something like this:
SELECT distinct actor_id , movie_id , gender , year
FROM [female]
WHERE movie_id IN (SELECT movie_id
FROM [female]
GROUP BY movie_id
HAVING Max(gender) = Min(gender)
AND Max(gender) = 'F')
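To see why the HAVING clause works, it may help to look at MIN(gender) and MAX(gender) per movie: the two are equal only when a movie has a single distinct gender value, and the extra check pins that value to 'F'. The sketch below is only an illustration on the question's sample rows, using Python's sqlite3 module with a plain table in place of the view; the helper name gender_range_per_movie is hypothetical.

```python
import sqlite3

# Sample rows copied from the question: (actor_id, movie_id, gender, year).
ROWS = [
    (528787, 2, "M", 1996),
    (528788, 2, "F", 1952),
    (528789, 1, "M", 2001),
    (528790, 3, "M", 1994),
    (528791, 2, "F", 2000),
    (528791, 3, "F", 2004),
    (528791, 4, "F", 2000),
    (528791, 5, "F", 2001),
    (528792, 4, "F", 1999),
    (528792, 6, "F", 2000),
]


def gender_range_per_movie(rows):
    """Return (movie_id, MIN(gender), MAX(gender)) for each movie."""
    conn = sqlite3.connect(":memory:")
    # Assumption: a plain table stands in for the `female` view.
    conn.execute(
        "CREATE TABLE female "
        "(actor_id INTEGER, movie_id INTEGER, gender TEXT, year INTEGER)"
    )
    conn.executemany("INSERT INTO female VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT movie_id, MIN(gender), MAX(gender)
        FROM female
        GROUP BY movie_id
        ORDER BY movie_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


for movie_id, lowest, highest in gender_range_per_movie(ROWS):
    # A movie qualifies only when both ends of the range are 'F'.
    print(movie_id, lowest, highest, lowest == highest == "F")
```

Movies 2 and 3 come out with the range 'F' to 'M' (mixed cast), movie 1 with 'M' to 'M', and only movies 4, 5 and 6 have 'F' at both ends.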
Answer 2 (score: 1)
Use NOT IN against the rows with gender 'M':
SELECT * FROM `test_data`
where movie_id
NOT IN (SELECT movie_id from test_data where gender = 'M')
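One caveat with NOT IN: if the subquery can return a NULL movie_id, the comparison never evaluates to true and the outer query returns no rows. The question does not say whether movie_id can be NULL, so the row with a missing movie_id below is purely hypothetical. The sketch uses Python's sqlite3 module and a small made-up test_data table to compare the plain query with a variant that filters NULLs out of the subquery; the helper name compare_not_in_variants is also hypothetical.

```python
import sqlite3


def compare_not_in_variants():
    """Run NOT IN with and without a NULL filter in the subquery."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test_data (actor_id INTEGER, movie_id INTEGER, gender TEXT)"
    )
    conn.executemany(
        "INSERT INTO test_data VALUES (?, ?, ?)",
        [
            (1, 4, "F"),
            (2, 2, "M"),
            (3, 2, "F"),
            (4, None, "M"),  # hypothetical row with a missing movie_id
        ],
    )
    plain = conn.execute(
        """
        SELECT actor_id FROM test_data
        WHERE movie_id NOT IN (SELECT movie_id FROM test_data
                               WHERE gender = 'M')
        ORDER BY actor_id
        """
    ).fetchall()
    null_safe = conn.execute(
        """
        SELECT actor_id FROM test_data
        WHERE movie_id NOT IN (SELECT movie_id FROM test_data
                               WHERE gender = 'M' AND movie_id IS NOT NULL)
        ORDER BY actor_id
        """
    ).fetchall()
    conn.close()
    return plain, null_safe


plain, null_safe = compare_not_in_variants()
print(plain)      # []
print(null_safe)  # [(1,)]
```

If movie_id is guaranteed to be non-NULL in the view, the plain NOT IN form behaves the same as the filtered one.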
Answer 3 (score: 0)
Try this:
SELECT *
FROM female
WHERE gender='F'
  AND movie_id NOT IN (SELECT movie_id FROM female WHERE gender='M');