I have a large table containing the words from text files (offset_1 is simply offset - 1):
file offset offset_1 word
---- ------ -------- ----
1.txt 1 0 I
1.txt 2 1 have
1.txt 3 2 a
1.txt 4 3 large
1.txt 5 4 table
1.txt 6 5 that
1.txt 7 6 contains
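I want to find pairs of words that occur within a given distance of each other or closer. For example, "a" and "table" with at most 1 word between them.
What I am currently doing (in MySQL) is: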
SELECT t1.offset, t3.offset
FROM t as t1 JOIN t as t2 JOIN t as t3
ON t2.file = t1.file AND t3.file = t1.file AND
(
(t1.offset = t2.offset_1 AND t2.offset = t3.offset_1) # "a large table"
OR (t1.offset = t3.offset_1 AND t2.offset = 1) # "a table"
)
WHERE t1.word = 'a' AND t3.word = 'table'
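But this never terminates (the table is large).
If I remove either one of the two conditions in the OR, it works fine and finds "a large table" or "a table", respectively.
What is the right way to do this?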
Answer 0: (score: 1)
Would this work?
SELECT t1.offset, t2.offset
FROM t as t1
JOIN t as t2 ON t2.file = t1.file
WHERE t1.word = 'a' AND t2.word = 'table'
AND (t2.offset - t1.offset) BETWEEN 1 AND 2 # 'table' must follow 'a'; a bare <= 2 would also match 'table' anywhere before 'a'
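The range-join idea above can be tried on a small data set. The following is a minimal sketch using Python's built-in sqlite3 rather than MySQL. The 2.txt file, the `find_pairs` helper, the `max_between` parameter, and the index are all assumptions added for illustration and are not part of the original post. The lower bound of 1 keeps only matches where the second word follows the first.

```python
import sqlite3

# 1.txt reproduces the question's sample; 2.txt is a hypothetical extra file.
TEXTS = {
    "1.txt": "I have a large table that contains",
    "2.txt": "a table is a big wide table",
}

def build_db():
    conn = sqlite3.connect(":memory:")
    # "offset" is quoted because OFFSET is also an SQL keyword.
    conn.execute('CREATE TABLE t (file TEXT, "offset" INTEGER, offset_1 INTEGER, word TEXT)')
    for name, text in TEXTS.items():
        for i, word in enumerate(text.split(), start=1):
            conn.execute("INSERT INTO t VALUES (?, ?, ?, ?)", (name, i, i - 1, word))
    # Assumed index (not in the original post) so lookups can seek by word and file.
    conn.execute('CREATE INDEX idx_word_file_offset ON t (word, file, "offset")')
    return conn

def find_pairs(conn, first, second, max_between):
    """Rows (file, offset of first, offset of second) where `second` follows
    `first` with at most `max_between` words in between."""
    sql = '''
        SELECT t1.file, t1."offset", t2."offset"
        FROM t AS t1
        JOIN t AS t2 ON t2.file = t1.file
        WHERE t1.word = ? AND t2.word = ?
          AND t2."offset" - t1."offset" BETWEEN 1 AND ?
        ORDER BY t1.file, t1."offset", t2."offset"
    '''
    return conn.execute(sql, (first, second, max_between + 1)).fetchall()

conn = build_db()
print(find_pairs(conn, "a", "table", 1))  # [('1.txt', 3, 5), ('2.txt', 1, 2)]
```

Raising `max_between` to 2 should additionally pick up the "a big wide table" occurrence in the hypothetical 2.txt. Whether a similar composite index helps on the real MySQL table depends on its size and existing indexes, so check with EXPLAIN.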
Answer 1: (score: 1)
I would suggest using union all to split this into two queries. Something like this:
SELECT t1.offset, t2.offset
FROM t t1 JOIN
t t2
ON t2.file = t1.file AND t1.offset = t2.offset_1
WHERE t1.word = 'a' AND t2.word = 'table'
UNION ALL
SELECT t1.offset, t3.offset
FROM t t1 JOIN
t t2
ON t2.file = t1.file AND t1.offset = t2.offset_1 JOIN
t t3
ON t3.file = t2.file and t2.offset = t3.offset_1
WHERE t1.word = 'a' AND t3.word = 'table';
OR in JOIN conditions usually has a bad effect on performance. Sometimes splitting the logic into multiple subqueries can be a big win.
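As a quick sanity check of the UNION ALL approach, here is a minimal sketch that runs it on a tiny data set with Python's built-in sqlite3 instead of MySQL. The 2.txt rows and the `file` column in the output are assumptions added for illustration. Each branch uses only equality joins on offset_1, with no OR in the join condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "offset" is quoted because OFFSET is also an SQL keyword.
conn.execute('CREATE TABLE t (file TEXT, "offset" INTEGER, offset_1 INTEGER, word TEXT)')

# 1.txt reproduces the question's sample; 2.txt is a hypothetical extra file.
texts = {
    "1.txt": "I have a large table that contains",
    "2.txt": "a table is a big wide table",
}
for name, text in texts.items():
    for i, word in enumerate(text.split(), start=1):
        conn.execute("INSERT INTO t VALUES (?, ?, ?, ?)", (name, i, i - 1, word))

sql = '''
    -- adjacent words: "a table"
    SELECT t1.file, t1."offset", t2."offset"
    FROM t AS t1
    JOIN t AS t2 ON t2.file = t1.file AND t1."offset" = t2.offset_1
    WHERE t1.word = 'a' AND t2.word = 'table'
    UNION ALL
    -- one word in between: "a large table"
    SELECT t1.file, t1."offset", t3."offset"
    FROM t AS t1
    JOIN t AS t2 ON t2.file = t1.file AND t1."offset" = t2.offset_1
    JOIN t AS t3 ON t3.file = t2.file AND t2."offset" = t3.offset_1
    WHERE t1.word = 'a' AND t3.word = 'table'
'''
rows = sorted(conn.execute(sql).fetchall())
print(rows)  # [('1.txt', 3, 5), ('2.txt', 1, 2)]
```

The two branches cannot return the same pair, because one requires a difference of exactly 1 and the other exactly 2, so UNION ALL is safe here and avoids the deduplication cost of a plain UNION.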