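Counting rows in a large table grouped by two columns, in the most efficient way
I have a product association table, and I want to count how many times each product is associated with another product so that I can find the strongest association. The table is very large, though, so I would like to know whether there is an optimized solution for this problem.
My table: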
mysql> SELECT * FROM user_association_data_2019_02;
+----+------------+-------------+---------------------+---------+
| id | product_id | association | last_modified       | user_id |
+----+------------+-------------+---------------------+---------+
|  6 |       1096 |        1355 | 2019-02-04 11:42:07 |    2940 |
| 17 |       1096 |        1758 | 2019-02-04 11:54:10 |    2940 |
| 19 |       1355 |        1758 | 2019-02-04 11:54:15 |    2940 |
| 24 |       1096 |        1758 | 2019-02-04 11:55:31 |    2940 |
| 37 |       1355 |        1758 | 2019-02-04 11:58:54 |    2940 |
| 53 |       1096 |         463 | 2019-02-04 16:38:49 |    2940 |
| 56 |       1758 |         560 | 2019-02-05 10:11:43 |    2940 |
| 57 |       1096 |         560 | 2019-02-05 10:11:45 |    2940 |
| 65 |       1096 |         560 | 2019-02-05 11:10:13 |    2940 |
| 70 |       1758 |         560 | 2019-02-05 12:11:50 |    2940 |
| 74 |       1758 |         560 | 2019-02-05 12:13:27 |    2940 |
| 75 |       1207 |         560 | 2019-02-05 12:13:30 |    2940 |
| 77 |       1096 |         560 | 2019-02-05 12:14:17 |    2940 |
| 79 |       1207 |        1355 | 2019-02-05 14:04:17 |    2940 |
| 81 |       1355 |         560 | 2019-02-06 14:17:25 |    2940 |
| 82 |       1096 |         560 | 2019-02-06 14:17:26 |    2940 |
+----+------------+-------------+---------------------+---------+
This query solves my problem:
mysql> SELECT product_id, association, count(*) as total FROM user_association_data_2019_02 GROUP BY product_id, association;
+------------+-------------+-------+
| product_id | association | total |
+------------+-------------+-------+
|       1096 |        1355 |     1 |
|       1096 |        1758 |     2 |
|       1096 |         463 |     1 |
|       1096 |         560 |     4 |
|       1207 |        1355 |     1 |
|       1207 |         560 |     1 |
|       1355 |        1758 |     2 |
|       1355 |         560 |     1 |
|       1758 |         560 |     3 |
+------------+-------------+-------+
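But I don't think this is optimized. How can I optimize this count?
Answer 0 (score: 1)
There is probably no other way to rewrite the query. However, you can improve performance by adding a composite index on the two grouped columns (replace `t` with your table name). Because the query reads only `product_id` and `association`, MySQL can usually answer it from the index alone instead of reading the full rows: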
ALTER TABLE t ADD INDEX ix_productid_association (product_id, association);
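To check whether the index is actually used, and to get only the strongest associations instead of every pair, something like the following sketch may help. It assumes the index above was created on the question's table `user_association_data_2019_02`; the `LIMIT 5` value is an arbitrary choice for illustration and is not part of the original question.

```sql
-- Assumption: the composite index from the answer exists on this table, e.g.
-- ALTER TABLE user_association_data_2019_02
--   ADD INDEX ix_productid_association (product_id, association);

-- Inspect the execution plan. "Using index" in the Extra column means the
-- query is answered from the index alone, without reading table rows.
EXPLAIN
SELECT product_id, association, COUNT(*) AS total
FROM user_association_data_2019_02
GROUP BY product_id, association;

-- Return only the most frequent pairs, strongest first.
-- LIMIT 5 is an illustrative value; adjust as needed.
SELECT product_id, association, COUNT(*) AS total
FROM user_association_data_2019_02
GROUP BY product_id, association
ORDER BY total DESC
LIMIT 5;
```

On the sample rows shown in the question, the strongest pair is product 1096 with association 560, which occurs 4 times. Note that the sort on `total` still has to process every group, so the index speeds up the counting step rather than the ordering step.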