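I have a project where, every week, I import several large datasets that contain bad data, such as duplicate employee IDs that should be unique. To flag the duplicates, I tried the following code: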
ALTER TABLE AccountDuplicates
ADD UNIQUE INDEX EmployeeID (EmployeeID);
INSERT INTO AccountDuplicates
SELECT
EmployeeID,
FirstName,
LastName
FROM AccountsWork
ON DUPLICATE KEY UPDATE
EmployeeID = CONCAT(VALUES(EmployeeID), '*');
The INSERT statement gives me the error below, and I can't see what I'm doing wrong:
[42000][1064] You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'FROM EAD_UserAccountsWork
ON DUPLICATE KEY UPDATE EmployeeID = CONCAT(VALUES(E' at line 36
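In case it's relevant, I'm running MySQL 5.7.12 on OS X 10.11.4 with the InnoDB engine and sql_mode = ''. My goal is to identify the duplicate IDs so that I can forward them to the appropriate DBA for correction.
Update: I have set the database defaults as follows: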
[client]
default-character-set = utf8mb4
[mysqld]
sql_mode=''
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
[mysql]
default-character-set = utf8mb4
Answer 0 (score: 0)
I think it's because you haven't properly qualified the "source" EmployeeID:
INSERT INTO AccountDuplicates
SELECT
EmployeeID,
FirstName,
LastName
FROM AccountsWork t
ON DUPLICATE KEY UPDATE
EmployeeID = CONCAT(t.EmployeeID, '*');
The EmployeeID on the left side of the = refers to the AccountDuplicates table; the one on the right side refers to the AccountsWork table.
Answer 1 (score: 0)
Your syntax looks fine; maybe there's a typo somewhere? This almost works:
-- drop table AccountsWork ;
-- drop table AccountDuplicates;
CREATE TABLE AccountsWork (
EmployeeID varchar(16),
FirstName INT,
LastName INT
);
CREATE TABLE AccountDuplicates (
EmployeeID varchar(16),
FirstName INT,
LastName INT
);
alter table AccountDuplicates add unique index(EmployeeID);
insert into AccountsWork values('a',2,3);
insert into AccountsWork values(1,2,3);
insert into AccountsWork values('b',2,3);
insert into AccountsWork values('c',2,3);
insert into AccountsWork values('c',2,3);
insert into AccountsWork values('c',2,3);
insert into AccountsWork values('c',2,3);
SELECT
*
FROM
AccountsWork;
-- there are no syntax errors here (your original query):
INSERT INTO AccountDuplicates
SELECT
EmployeeID,
FirstName,
LastName
FROM AccountsWork
ON DUPLICATE KEY UPDATE
EmployeeID = CONCAT(VALUES(EmployeeID), '*');
SELECT
*
FROM
AccountDuplicates;
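A simple INSERT with ON DUPLICATE KEY UPDATE works, but the "INSERT ... SELECT FROM table ... ON DUPLICATE KEY UPDATE" form does not.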
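A possible explanation for the failure, offered as a reading of the sample data rather than something this answer confirms: appending '*' changes the unique key itself. Assuming the rows are processed in the order they were inserted, the second 'c' renames the stored row to 'c*', the third 'c' is then inserted as a new row, and the fourth 'c' tries to rename that row to 'c*' as well, which collides with the unique index. A minimal sketch of one way around this is to leave EmployeeID untouched and count repeats in a separate column. DuplicateCount is a hypothetical column name, not part of the original schema.

```sql
-- Hypothetical extra column that counts how often an ID was seen again.
ALTER TABLE AccountDuplicates
    ADD COLUMN DuplicateCount INT NOT NULL DEFAULT 0;

-- An explicit column list is needed now that the target has a fourth column.
INSERT INTO AccountDuplicates (EmployeeID, FirstName, LastName)
SELECT EmployeeID, FirstName, LastName
FROM AccountsWork
ON DUPLICATE KEY UPDATE
    DuplicateCount = DuplicateCount + 1;  -- the key itself is never modified

-- Rows with DuplicateCount > 0 had at least one repeat in the import.
SELECT EmployeeID, DuplicateCount
FROM AccountDuplicates
WHERE DuplicateCount > 0;
```

With the sample rows above, 'c' appears four times, so it should end up with DuplicateCount = 3 while 'a', '1' and 'b' stay at 0.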
I think you need to look at this question: INSERT INTO ... SELECT FROM ... ON DUPLICATE KEY UPDATE
It looks like the MySQL parser goes crazy on this kind of query.
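Since the stated goal is only to identify the duplicated IDs, one alternative that sidesteps ON DUPLICATE KEY UPDATE entirely is a plain aggregate query. This is a sketch and not something proposed in the answers above; it assumes the AccountsWork table and EmployeeID column from the question, and the alias Occurrences is made up for illustration.

```sql
-- List every EmployeeID that appears more than once in the staging table.
SELECT
    EmployeeID,
    COUNT(*) AS Occurrences
FROM AccountsWork
GROUP BY EmployeeID
HAVING COUNT(*) > 1
ORDER BY Occurrences DESC;
```

The result can be handed to the DBA as-is, and no unique index or second table is needed for the check.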