Question

An oddly phrased question, but it is a little easier to explain with the table structures. Two tables:

CREATE TABLE `posts` (
  `id` bigint(20) AUTO_INCREMENT,
  `text` mediumtext,
  PRIMARY KEY (`id`)
);

CREATE TABLE `dictionary` (
  `id` bigint(20) AUTO_INCREMENT,
  `term` varchar(255),
  `definition` varchar(255),
  PRIMARY KEY (`id`),
  UNIQUE KEY `ix_term` (`term`)
);

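The posts table contains large blocks of arbitrary text. The dictionary table maintains a mapping of terms (i.e., single words that are likely to occur in the text) to their definitions (longer text).

Some example posts data: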

+----+-----------+
| id | text      |
+----+-----------+
|  1 | foo       |
|  2 | bar       |
|  3 | foo bar   |
|  4 | foobarbaz |
+----+-----------+

Some example dictionary data:

+----+------+--------------------------+
| id | term | definition               |
+----+------+--------------------------+
|  1 | foo  | A foo is a foo.          |
|  2 | bar  | A bar is a bar.          |
|  3 | baz  | A baz is something else. |
|  4 | quux | Who knows.               |
+----+------+--------------------------+

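In the example data, there is a dictionary entry for the term quux, which does not appear in the text of any post. I want to delete these unused rows from the dictionary table, but given the layout of the schema, there does not seem to be a particularly efficient way to do it.

The best I have been able to piece together is: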

DELETE `dictionary` FROM `dictionary`
LEFT JOIN `posts` ON `posts`.`text` LIKE CONCAT('%', `dictionary`.`term`, '%')
WHERE `posts`.`id` IS NULL;

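...and it is sluggish. I am wondering whether there is a more efficient way to structure the JOIN condition, a better way to do the LIKE %...%, or an entirely different approach that would make the search of posts.text run faster.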

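(As an aside, I recognize that a separate table linking posts to their relevant dictionary rows would be a more efficient way to maintain and search this data, but the application code is what it is.)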

Answer 1

Create table as select (CTAS) should be faster than deleting with a join.

Using CTAS:

1. Create the new table, filtering with WHERE EXISTS

CREATE TABLE dictionary_new AS
SELECT * FROM dictionary
WHERE EXISTS (SELECT 1 FROM posts WHERE posts.text LIKE CONCAT('%', dictionary.term, '%'));

2. Drop the original table

DROP TABLE dictionary;

3. Rename the table

RENAME TABLE dictionary_new TO dictionary;

4. Create the constraints

ALTER TABLE `dictionary` ADD PRIMARY KEY (`id`);
ALTER TABLE `dictionary` ADD UNIQUE KEY `ix_term` (`term`);
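
As a hedged sketch of how the steps above could be combined in MySQL: the variant below builds and indexes the new table first, then swaps it in with a single RENAME TABLE, so `dictionary` is never missing while the script runs. It also restores AUTO_INCREMENT on `id`, which CREATE TABLE ... SELECT does not carry over (indexes are not copied either). The name `dictionary_old` and the aliases `d` and `p` are illustrative assumptions, not part of the original answer.

```sql
-- 1. Copy only the terms that occur in at least one post.
CREATE TABLE dictionary_new AS
SELECT d.*
FROM dictionary AS d
WHERE EXISTS (
  SELECT 1
  FROM posts AS p
  WHERE p.text LIKE CONCAT('%', d.term, '%')
);

-- 2. Recreate what CTAS does not copy: the AUTO_INCREMENT attribute
--    and both keys from the original definition.
ALTER TABLE dictionary_new
  MODIFY `id` bigint(20) NOT NULL AUTO_INCREMENT,
  ADD PRIMARY KEY (`id`),
  ADD UNIQUE KEY `ix_term` (`term`);

-- 3. Swap the tables in one statement (dictionary_old is a hypothetical name).
RENAME TABLE dictionary TO dictionary_old,
             dictionary_new TO dictionary;

-- 4. Drop the old copy only after checking the new table.
DROP TABLE dictionary_old;
```

Note that LIKE treats any % or _ inside a term as a wildcard; if terms can contain those characters, a test such as INSTR(p.text, d.term) > 0 is one possible substitute. Either way, every term is still matched against the post text without index support, so any gain would come from avoiding the row-by-row delete rather than from a faster search.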

How to efficiently delete rows from a table based on a text search?
