Question

I have two tables named TempTable and AnotherTable with the structures defined below. I have also included some sample rows from each table.

TempTable definition

CREATE TABLE `TempTable` (
  `ROWNUMBER` bigint(19) NOT NULL DEFAULT '0',
  `email` text,
  `someid` bigint(19) DEFAULT NULL,
  `mappedid` bigint(19) DEFAULT NULL,
  PRIMARY KEY (`ROWNUMBER`),
  KEY `IDX_1` (`email`(100))
) ENGINE=InnoDB DEFAULT CHARSET=utf8

AnotherTable definition

CREATE TABLE `AnotherTable` (
  `primaryid` bigint(19) NOT NULL DEFAULT '0',
  `email` text,
  PRIMARY KEY (`primaryid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

mysql> Select * from TempTable;

+-----------+----------------------+--------+----------+
| ROWNUMBER | email                | someid | mappedid |
+-----------+----------------------+--------+----------+
|         1 | email1@somewhere.com |    101 |     NULL |
|         2 | email1@somewhere.com |    102 |     NULL |
|         3 | email1@somewhere.com |    103 |     NULL |
|         4 | email1@somewhere.com |    104 |     NULL |
|         5 | email2@somewhere.com |    105 |     NULL |
|         6 | email2@somewhere.com |    106 |     NULL |
|         7 | email2@somewhere.com |    107 |     NULL |
|         8 | email3@somewhere.com |    108 |     NULL |
+-----------+----------------------+--------+----------+
8 rows in set (0.00 sec)

mysql> Select * from AnotherTable;

+-----------+----------------------+
| primaryid | email                |
+-----------+----------------------+
|       201 | email1@somewhere.com |
|       202 | email1@somewhere.com |
|       203 | email1@somewhere.com |
|       204 | email2@somewhere.com |
+-----------+----------------------+
4 rows in set (0.00 sec)

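Here, the mappedid column in TempTable relates to primaryid in AnotherTable. My goal is to update mappedid in TempTable based on emails that match between TempTable and AnotherTable. The match must be based on the email field only. So the result I want looks like this: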

mysql> Select * from TempTable;

+-----------+----------------------+--------+----------+
| ROWNUMBER | email                | someid | mappedid |
+-----------+----------------------+--------+----------+
|         1 | email1@somewhere.com |    101 |      201 |
|         2 | email1@somewhere.com |    102 |      202 |
|         3 | email1@somewhere.com |    103 |      203 |
|         4 | email1@somewhere.com |    104 |     NULL |
|         5 | email2@somewhere.com |    105 |      204 |
|         6 | email2@somewhere.com |    106 |     NULL |
|         7 | email2@somewhere.com |    107 |     NULL |
|         8 | email3@somewhere.com |    108 |     NULL |
+-----------+----------------------+--------+----------+
8 rows in set (0.00 sec)

Here, 201, 202, 203, and 204 each appear only once, and the remaining unmapped rows should be NULL. There must be no duplicate mappings in TempTable.

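Note: In the real world, I don't think it is advisable to run select queries against AnotherTable, because it will hold millions of records. So I am looking for an alternative, efficient way to update the data in TempTable. TempTable is a temporary table, and any number of operations on it is fine.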

mysql> update TempTable inner join AnotherTable on TempTable.email= AnotherTable.email and TempTable.email!='' set TempTable.mappedid=AnotherTable.primaryid WHERE TempTable.mappedid is null;

Query OK, 7 rows affected (0.01 sec)
Rows matched: 7  Changed: 7  Warnings: 0

mysql> Select * from TempTable;

+-----------+----------------------+--------+----------+
| ROWNUMBER | email                | someid | mappedid |
+-----------+----------------------+--------+----------+
|         1 | email1@somewhere.com |    101 |      201 |
|         2 | email1@somewhere.com |    102 |      201 |
|         3 | email1@somewhere.com |    103 |      201 |
|         4 | email1@somewhere.com |    104 |      201 |
|         5 | email2@somewhere.com |    105 |      204 |
|         6 | email2@somewhere.com |    106 |      204 |
|         7 | email2@somewhere.com |    107 |      204 |
|         8 | email3@somewhere.com |    108 |     NULL |
+-----------+----------------------+--------+----------+
8 rows in set (0.00 sec)

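I tried the update query above, which uses an inner join. But it created duplicate mappedid entries in TempTable, as shown above. To remove the redundant entries, my current option is to set all the duplicated entries back to NULL and then run a select against AnotherTable for each email. After removing the redundant entries, the tables look like this: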

mysql> Select * from TempTable;

+-----------+----------------------+--------+----------+
| ROWNUMBER | email                | someid | mappedid |
+-----------+----------------------+--------+----------+
|         1 | email1@somewhere.com |    101 |      201 |
|         2 | email1@somewhere.com |    102 |     NULL |
|         3 | email1@somewhere.com |    103 |     NULL |
|         4 | email1@somewhere.com |    104 |     NULL |
|         5 | email2@somewhere.com |    105 |      204 |
|         6 | email2@somewhere.com |    106 |     NULL |
|         7 | email2@somewhere.com |    107 |     NULL |
|         8 | email3@somewhere.com |    108 |     NULL |
+-----------+----------------------+--------+----------+
8 rows in set (0.00 sec)

mysql> Select * from AnotherTable;

+-----------+----------------------+
| primaryid | email                |
+-----------+----------------------+
|       201 | email1@somewhere.com |
|       202 | email1@somewhere.com |
|       203 | email1@somewhere.com |
|       204 | email2@somewhere.com |
+-----------+----------------------+
4 rows in set (0.00 sec)

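Then I have to run "select primaryid from AnotherTable where email = 'email1@somewhere.com'" and update mappedid in TempTable based on the contents of the result set. The problem is that because I have 2 duplicated emails (email1@somewhere.com and email2@somewhere.com), I need to query AnotherTable 2 times. If the number of duplicated emails grows to 100, that basically means I have to query AnotherTable, which is already a heavy table, 100 times (by the way, the email column will be indexed in AnotherTable). I know this is not the right solution. Can you help me come up with an efficient solution for handling a large number of records?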

Answer 1

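The fact is that the email column alone is not enough to join your tables correctly. In addition, you need some kind of position number within each email.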

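-- @n1/@n2 count rows within an email; @g1/@g2 remember the previous email.
SET @n1 := 0, @g1 := NULL;
SET @n2 := 0, @g2 := NULL;

-- Table names match the CREATE TABLE statements exactly, because table
-- names can be case-sensitive depending on the server platform.
UPDATE TempTable t JOIN
(
  SELECT a.rownumber, b.primaryid
    FROM
  (
    -- Number the TempTable rows 1, 2, 3, ... within each email.
    SELECT rownumber, email, @n1 := IF(@g1 = email, @n1 + 1, 1) rnum, @g1 := email
      FROM TempTable
     ORDER BY email, rownumber
  ) a LEFT JOIN
  (
    -- Number the AnotherTable rows the same way.
    SELECT primaryid, email, @n2 := IF(@g2 = email, @n2 + 1, 1) rnum, @g2 := email
      FROM AnotherTable
     ORDER BY email, primaryid
  ) b
      ON a.email = b.email
     AND a.rnum = b.rnum
   -- Keep only the TempTable rows that found a partner at the same position.
   WHERE b.primaryid IS NOT NULL
) s
    ON t.rownumber = s.rownumber
   SET t.mappedid = s.primaryid;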

Here is a SQLFiddle demo.
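
If the server happens to be MySQL 8.0 or later (an assumption, since the question does not state a version), the same per-email position number can be produced with the ROW_NUMBER() window function instead of user variables. The following is only a sketch of that variant, using the table and column names from the question. It is not part of the original answer and has not been measured against a table with millions of rows.

```sql
-- Sketch only: assumes MySQL 8.0+, where ROW_NUMBER() is available.
-- Each derived table numbers its rows 1, 2, 3, ... within one email,
-- and rows are paired when both the email and the position match.
UPDATE TempTable t
JOIN (
    SELECT ROWNUMBER,
           email,
           ROW_NUMBER() OVER (PARTITION BY email ORDER BY ROWNUMBER) AS rnum
      FROM TempTable
) a
    ON a.ROWNUMBER = t.ROWNUMBER
JOIN (
    SELECT primaryid,
           email,
           ROW_NUMBER() OVER (PARTITION BY email ORDER BY primaryid) AS rnum
      FROM AnotherTable
) b
    ON b.email = a.email
   AND b.rnum  = a.rnum
   SET t.mappedid = b.primaryid;
```

With the sample data from the question, rows 1, 2, and 3 pair with 201, 202, and 203, row 5 pairs with 204, and the remaining rows find no partner and keep mappedid as NULL. Both tables are read once each, rather than once per duplicated email.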
