Question

I am trying to run this query:

update table1 A 
set number = (select count(distinct(id)) from table2 B where B.col1 = A.col1 or B.col2 = A.col2);

But it takes forever, because table1 has 1,100,000 rows and table2 has 350,000,000 rows.

Is there a faster way to run this query in R? Or in Python?

Answer 1

I rewrote your query using three subqueries instead of one, with a UNION and two INNER JOIN statements:

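UPDATE table1 AS A
SET number = (SELECT COUNT(DISTINCT id)
              FROM
                  (SELECT A.id AS id
                   FROM table1 AS A
                   INNER JOIN table2 AS B
                   ON A.col1 = B.col1 -- condition for col1

                   UNION DISTINCT

                   SELECT A.id AS id
                   FROM table1 AS A
                   INNER JOIN table2 AS B
                   ON A.col2 = B.col2 -- condition for col2
                  ) AS matched -- MySQL requires an alias on every derived table
              -- Note: the inner A shadows the outer A, so as written this
              -- count is not tied to the row being updated.
             );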
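
To check the idea behind the UNION rewrite on a small scale, here is a minimal sketch in Python. It is not the query above: it is a per-row variant that counts ids from table2, and it uses the standard-library sqlite3 module with made-up toy rows as a stand-in for MySQL, so the helper name counts_for and the sample data are assumptions for illustration only. It also assumes id is never NULL, since COUNT(DISTINCT id) skips NULLs while the UNION form would count them.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 INTEGER, col2 INTEGER, number INTEGER)")
conn.execute("CREATE TABLE table2 (id INTEGER, col1 INTEGER, col2 INTEGER)")

# Hypothetical toy rows, not taken from the question.
conn.executemany(
    "INSERT INTO table1 (col1, col2) VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 30)],
)
conn.executemany(
    "INSERT INTO table2 (id, col1, col2) VALUES (?, ?, ?)",
    [
        (100, 1, 99),   # matches table1 row (1, 10) on col1
        (101, 98, 10),  # matches table1 row (1, 10) on col2
        (102, 1, 10),   # matches table1 row (1, 10) on both columns
        (103, 2, 97),   # matches table1 row (2, 20) on col1
        (103, 96, 20),  # same id again, matches row (2, 20) on col2
    ],
)

# The original condition: one scan with OR.
OR_COUNT = """
    SELECT COUNT(DISTINCT id) FROM table2
    WHERE col1 = :c1 OR col2 = :c2
"""

# The UNION form: two single-column lookups, duplicates removed by UNION.
UNION_COUNT = """
    SELECT COUNT(*) FROM (
        SELECT id FROM table2 WHERE col1 = :c1
        UNION
        SELECT id FROM table2 WHERE col2 = :c2
    ) AS matched
"""


def counts_for(c1, c2):
    """Return (count via OR, count via UNION) for one table1 row."""
    params = {"c1": c1, "c2": c2}
    by_or = conn.execute(OR_COUNT, params).fetchone()[0]
    by_union = conn.execute(UNION_COUNT, params).fetchone()[0]
    return by_or, by_union


rows = conn.execute("SELECT col1, col2 FROM table1").fetchall()
results = {(c1, c2): counts_for(c1, c2) for c1, c2 in rows}

for key, (by_or, by_union) in sorted(results.items()):
    print(key, by_or, by_union)
```

Both forms agree on this toy data. Whether the UNION form is actually faster on the real tables depends on the MySQL version and on what the optimizer does with each branch, so it is worth measuring rather than assuming.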

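My notes:

Updating every row in table1 does not look great, because we have to touch 1.1M rows. A different data structure for storing number might perform better.
Try running just part of the query without updating table1 (only the part in parentheses).
If you need more general ways to optimize SQL queries, take a look at EXPLAIN: https://dev.mysql.com/doc/refman/5.7/en/using-explain.html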
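
The notes above can be tried out roughly like this. This is only a sketch under stated assumptions: it uses Python's standard-library sqlite3 module with generated toy data instead of MySQL, so the plan comes from SQLite's EXPLAIN QUERY PLAN rather than MySQL's EXPLAIN, and the side table name table1_number is hypothetical (one possible reading of "a different data structure", not something from the question).

```python
import sqlite3
import time

# In-memory SQLite database used only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 INTEGER, col2 INTEGER, number INTEGER)")
conn.execute("CREATE TABLE table2 (id INTEGER, col1 INTEGER, col2 INTEGER)")

# Generated toy data; the real tables are far larger.
conn.executemany(
    "INSERT INTO table1 (col1, col2) VALUES (?, ?)",
    [(i % 50, i % 70) for i in range(200)],
)
conn.executemany(
    "INSERT INTO table2 (id, col1, col2) VALUES (?, ?, ?)",
    [(i, i % 50, i % 70) for i in range(5000)],
)

# The read-only part of the UPDATE: the count per table1 row, no writes.
COUNT_ONLY = """
    SELECT A.rowid AS table1_rowid,
           (SELECT COUNT(DISTINCT B.id)
            FROM table2 AS B
            WHERE B.col1 = A.col1 OR B.col2 = A.col2) AS number
    FROM table1 AS A
    ORDER BY A.rowid
"""

# Note 3: inspect the plan before running anything expensive.
# The human-readable detail is the last column of each plan row.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + COUNT_ONLY)]
for line in plan:
    print(line)

# Note 2: time only the SELECT, on a small sample of rows.
start = time.perf_counter()
sample = conn.execute(COUNT_ONLY + " LIMIT 20").fetchall()
elapsed = time.perf_counter() - start
print(f"{len(sample)} rows counted in {elapsed:.4f} s")

# Note 1: store the counts in a separate table instead of updating table1.
conn.execute("CREATE TABLE table1_number AS " + COUNT_ONLY)
stored = conn.execute("SELECT COUNT(*) FROM table1_number").fetchone()[0]
print("rows stored in table1_number:", stored)
```

Timing a small LIMIT first gives a rough per-row cost, which can be multiplied out to estimate the full 1.1M-row run before committing to it.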
