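I have a table with three columns: id, sentence, and language. Sentences can be in English or German, and the same id is assigned to sentences that have the same meaning in different languages, for example: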
ID | sentence | language
1 | Hello | en
1 | Hallo | de
2 | Sorry | en
A sentence may exist in only one language. Now I want to find all sentences that are available in both languages, which I tried with:
SELECT
    *
FROM
    `sentences`
WHERE
    LENGTH(sentence) > 0
        AND (language = 'en' OR language = 'de')
GROUP BY id
HAVING COUNT(language) = 2
This returns only one row per id, and I get just the German sentences. So then I ran:
SELECT
    *
FROM
    sentences
WHERE
    id IN (SELECT
            id
        FROM
            `sentences`
        WHERE
            LENGTH(sentence) > 0
                AND (language = 'en' OR language = 'de')
        GROUP BY id
        HAVING COUNT(language) = 2)
This should work, but the query takes forever. My question: is there a smarter way to do this?
Answer 0 (score: 2)
An INNER JOIN is usually faster than an IN clause with a subquery:
SELECT en.id,
       en.sentence AS en_sentence,
       de.sentence AS de_sentence,
       en.language AS en_language,
       de.language AS de_language
FROM sentences en
INNER JOIN sentences de ON en.id = de.id AND en.language = 'en' AND de.language = 'de'
WHERE LENGTH(en.sentence) > 0
  AND LENGTH(de.sentence) > 0
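To try the self-join on the sample rows from the question, something like the following should work in a scratch database. The column types in `CREATE TABLE` are assumptions, since the question does not show the schema.

```sql
-- Hypothetical schema: the column types are assumed, not taken from the question.
CREATE TABLE sentences (
    id       INT          NOT NULL,
    sentence VARCHAR(255) NOT NULL,
    language CHAR(2)      NOT NULL
);

-- The three sample rows shown in the question.
INSERT INTO sentences (id, sentence, language) VALUES
    (1, 'Hello', 'en'),
    (1, 'Hallo', 'de'),
    (2, 'Sorry', 'en');

-- Pair each English row with the German row that shares its id.
SELECT en.id,
       en.sentence AS en_sentence,
       de.sentence AS de_sentence
FROM sentences AS en
INNER JOIN sentences AS de
        ON en.id = de.id
       AND en.language = 'en'
       AND de.language = 'de'
WHERE LENGTH(en.sentence) > 0
  AND LENGTH(de.sentence) > 0;
-- Returns one row: 1 | Hello | Hallo
```

Id 2 drops out because it has no German row to join against, so each result row carries both translations side by side.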
Answer 1 (score: 1)
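If your data allows it, delete the sentences of length 0. Back up the table before running this:
DELETE FROM sentences WHERE LENGTH(sentence) = 0;
Drop the SELECT * and select only the columns you actually need. If you don't have an index yet, add a composite index on language and id.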
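That leaves you with:
-- Note: sentence and language come from an arbitrary row of each group,
-- and MySQL rejects this select list when ONLY_FULL_GROUP_BY is enabled.
SELECT
    id, sentence, language
FROM
    `sentences`
WHERE
    language = 'en' OR language = 'de'
GROUP BY id
HAVING COUNT(language) = 2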
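As a rough sketch of the index and of a grouped query that stays valid under ONLY_FULL_GROUP_BY: the index name `idx_language_id` and the alias `both_languages` are made up for illustration, and `COUNT(DISTINCT language)` is a substitution for the original `COUNT(language)` that guards against an id having two rows in the same language.

```sql
-- Composite index on (language, id); the index name is hypothetical.
CREATE INDEX idx_language_id ON sentences (language, id);

-- Ids that have a non-empty sentence in both English and German.
SELECT id
FROM sentences
WHERE language IN ('en', 'de')
  AND LENGTH(sentence) > 0
GROUP BY id
HAVING COUNT(DISTINCT language) = 2;

-- To get the sentence rows themselves, join back to the grouped result
-- as a derived table instead of using IN (subquery).
SELECT s.id, s.sentence, s.language
FROM sentences AS s
INNER JOIN (
    SELECT id
    FROM sentences
    WHERE language IN ('en', 'de')
      AND LENGTH(sentence) > 0
    GROUP BY id
    HAVING COUNT(DISTINCT language) = 2
) AS both_languages
        ON both_languages.id = s.id
WHERE s.language IN ('en', 'de')
  AND LENGTH(s.sentence) > 0;
```

The first query returns one row per matching id. The second returns two rows per id, one for each language, assuming at most one sentence per id and language.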