I have a table with about 10,000 records in which some titles are duplicated, some of them more than five times.
Sample data:
id| titleslug | views
--------------------
1 |the-box| 200
2 |the-box| 100
3 |the-box| 10
4 |the-man| 15
5 |the-man| 30
6 |the-cup| 10
7 |the-cup| 20
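'the-box' appears three times, so I want to leave it alone, but 'the-man' and 'the-cup' each appear twice and I want to delete one row of each, so that the final table becomes: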
id| titleslug | views
--------------------
1 |the-box| 200
2 |the-box| 100
3 |the-box| 10
5 |the-man| 30
7 |the-cup| 20
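If possible, I would also like to add the view count of the deleted row to the row that is kept, which should be the one with the highest view count.
With the query below I can find out how many times each title is repeated: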
select titleslug, count(*) as c from articles
group by titleslug having c > 1
order by c desc
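I want to delete one of the records only for titles that are repeated exactly twice, and leave everything else as it is. I am considering a query like this: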
DELETE a
FROM articles as a, articles as b
WHERE
(a.titleslug = b.titleslug OR a.titleslug IS NULL AND b.titleslug IS NULL)
AND a.views < b.views;
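However, I need help restricting the delete to titles that have exactly two duplicates, removing one row from each.
I have also tried the query below. It reports affected rows, but when I query the table afterwards the duplicates do not seem to have been deleted: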
DELETE a
FROM articles_copy a
JOIN (SELECT MAX(t.Views) AS max_a1, t.TitleSlug
FROM articles_copy t
GROUP BY t.TitleSlug, t.Views
HAVING COUNT(*)>1 AND COUNT(*)<=2) b ON b.TitleSlug = a.TitleSlug
AND b.max_a1 > a.View
Answer 0 (score: 1)
One option (evaluate it for performance issues on your real data):
mysql> DROP TABLE IF EXISTS `articles`;
Query OK, 0 rows affected (0.00 sec)
mysql> CREATE TABLE IF NOT EXISTS `articles` (
-> `id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
-> `title` VARCHAR(25) NOT NULL,
-> `views` INT UNSIGNED
-> );
Query OK, 0 rows affected (0.00 sec)
mysql> INSERT INTO `articles`
-> (`title`, `views`)
-> VALUES
-> ('the-box', 200),
-> ('the-box', 100),
-> ('the-box', 10),
-> ('the-man', 15),
-> ('the-man', 30),
-> ('the-cup', 10),
-> ('the-cup', 20);
Query OK, 7 rows affected (0.00 sec)
Records: 7 Duplicates: 0 Warnings: 0
mysql> SELECT
-> `id`,
-> `title`,
-> `views`
-> FROM
-> `articles`;
+----+---------+-------+
| id | title   | views |
+----+---------+-------+
|  1 | the-box |   200 |
|  2 | the-box |   100 |
|  3 | the-box |    10 |
|  4 | the-man |    15 |
|  5 | the-man |    30 |
|  6 | the-cup |    10 |
|  7 | the-cup |    20 |
+----+---------+-------+
7 rows in set (0.00 sec)
mysql> START TRANSACTION;
Query OK, 0 rows affected (0.00 sec)
mysql> UPDATE `articles`
-> INNER JOIN (
-> SELECT MAX(`id`) `id`, SUM(`views`) `views`
-> FROM `articles`
-> GROUP BY `title`
-> HAVING COUNT(`title`) = 2
-> ) `der`
-> SET `articles`.`views` = `der`.`views`
-> WHERE `articles`.`id` = `der`.`id`;
Query OK, 2 rows affected (0.00 sec)
Rows matched: 2 Changed: 2 Warnings: 0
mysql> DELETE FROM `articles`
-> WHERE `id` IN (SELECT MIN(`der`.`id`)
-> FROM (SELECT `id`, `title`
-> FROM `articles`) `der`
-> GROUP BY `der`.`title`
-> HAVING COUNT(`der`.`title`) = 2);
Query OK, 2 rows affected (0.00 sec)
mysql> COMMIT;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT
-> `id`,
-> `title`,
-> `views`
-> FROM
-> `articles`;
+----+---------+-------+
| id | title   | views |
+----+---------+-------+
|  1 | the-box |   200 |
|  2 | the-box |   100 |
|  3 | the-box |    10 |
|  5 | the-man |    45 |
|  7 | the-cup |    30 |
+----+---------+-------+
5 rows in set (0.00 sec)
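Note that the statements above keep the row with the highest `id` in each pair. In the sample data that row also happens to have the most views, but that may not hold in general. The following is a minimal, untested sketch of a variant that keeps the row with more views, falling back to the higher `id` on a tie. The aliases `keep`, `dup` and `pairs` are illustrative names, not part of the original answer, and the sketch assumes `views` is never NULL.

```sql
START TRANSACTION;

-- Step 1: fold the views of the lower-viewed row into the higher-viewed row,
-- but only for titles that occur exactly twice.
UPDATE `articles` AS `keep`
INNER JOIN `articles` AS `dup`
        ON `dup`.`title` = `keep`.`title`
       AND `dup`.`id` <> `keep`.`id`
INNER JOIN (
        -- Derived table: titles with exactly two rows (assumed alias `pairs`).
        SELECT `title`
        FROM `articles`
        GROUP BY `title`
        HAVING COUNT(*) = 2
      ) AS `pairs`
        ON `pairs`.`title` = `keep`.`title`
SET `keep`.`views` = `keep`.`views` + `dup`.`views`
WHERE `keep`.`views` > `dup`.`views`
   OR (`keep`.`views` = `dup`.`views` AND `keep`.`id` > `dup`.`id`);

-- Step 2: delete the other row of each pair. After step 1 the kept row
-- has at least as many views as the row to delete, so the same
-- comparison (with the id tie-break) still identifies it.
DELETE `dup`
FROM `articles` AS `dup`
INNER JOIN `articles` AS `keep`
        ON `keep`.`title` = `dup`.`title`
       AND `keep`.`id` <> `dup`.`id`
INNER JOIN (
        SELECT `title`
        FROM `articles`
        GROUP BY `title`
        HAVING COUNT(*) = 2
      ) AS `pairs`
        ON `pairs`.`title` = `dup`.`title`
WHERE `dup`.`views` < `keep`.`views`
   OR (`dup`.`views` = `keep`.`views` AND `dup`.`id` < `keep`.`id`);

COMMIT;
```

On the sample data this should leave the same five rows as the result above, since ids 5 and 7 are both the newest and the most-viewed rows of their pairs. As with the original statements, it is worth running this against a copy of the table first and checking the result before the `COMMIT`.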