I want to import a large CSV (close to 100 MB) into a table with the following structure:
+-------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+------------------+------+-----+---------+----------------+
| id | int(11) unsigned | NO | PRI | NULL | auto_increment |
| cep | varchar(255) | YES | MUL | NULL | |
| site | text | YES | | NULL | |
| cidade | text | YES | | NULL | |
| uf | text | YES | | NULL | |
| cepbase | text | YES | | NULL | |
| segmentacao | text | YES | | NULL | |
| area | text | YES | | NULL | |
| cepstatus | int(1) | YES | | NULL | |
| score | int(11) | NO | | NULL | |
| fila | int(11) | NO | | NULL | |
+-------------+------------------+------+-----+---------+----------------+
I was about to write some code to do the import, but then I found a MySQL command that does what I need. So I wrote the following:
LOAD DATA LOCAL INFILE '/Users/user/Downloads/base.csv'
INTO TABLE cep_status_new
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS
(@id,@cep,@site,@cidade,@uf,@cepbase,@segmentacao,@area,@cepstatus,@score,@fila)
SET id=NULL, cep=@col1, site='GOD', cidade=@col6, uf=@col7, cepbase='-', segmentacao=@col9, cepstatus=@col2, area='BING', score=99999, fila=5;
To try this statement out, I deleted 1000 rows from the CSV and kept only 2 lines: the header and one sample row:
cep,status,gang,bang,random,mock,awesome,qwert,hero
01019000,0,00387,00388,3550308,SAO PAULO,SP,011,B2
The statement ran without errors, but the inserted row looks strange:
mysql> select * from cep_status_new;
+----+------+------+--------+---------+---------+-------------+------+-----------+-------+------+
| id | cep | site | cidade | uf | cepbase | segmentacao | area | cepstatus | score | fila |
+----+------+------+--------+---------+---------+-------------+------+-----------+-------+------+
| 1 | 1 | GOD | 24655 | 3554805 | - | SP | BING | 0 | 99999 | 5 |
+----+------+------+--------+---------+---------+-------------+------+-----------+-------+------+
1 row in set (0.01 sec)
Why aren't the CSV values being populated correctly?
Answer 0 (score: 1)
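According to this specification, the column list after IGNORE 1 ROWS determines how the columns of the CSV file are mapped to the columns of the table. It can either name table columns in file order, or load file columns into variables. With the column list
(@id,@cep,@site,@cidade,@uf,@cepbase,@segmentacao,@area,@cepstatus,@score,@fila)
you are loading the fields of the CSV file into variables named "id", "cep", and so on (11 variables, although the sample file only has 9 columns). Then, in the SET clause, you have to state how each table column is built from those variables. In your statement, the variables you reference there (@col1 and so on) are never assigned by the column list, so their values are undefined as far as this import is concerned: in MySQL an unassigned user variable is NULL, or it keeps whatever value was left in it earlier in the same session.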
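To make the mapping rule concrete, here is a minimal sketch. The table demo_import, the file /tmp/demo.csv and the variable names @status and @unused are hypothetical and only for illustration; the file is assumed to have a header line and three comma-separated fields per row. The first field goes straight into a table column, the second passes through a variable, and the third is read into a variable that is never used, which discards it.

```sql
-- Hypothetical demo table, not the table from the question.
CREATE TABLE demo_import (
  id        INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  cep       VARCHAR(255),
  cepstatus INT,
  site      TEXT
);

-- Assumed input: /tmp/demo.csv with a header line and 3 fields per row.
LOAD DATA LOCAL INFILE '/tmp/demo.csv'
INTO TABLE demo_import
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(cep, @status, @unused)      -- field 1 -> column cep, fields 2 and 3 -> variables
SET cepstatus = @status,     -- column built from a variable the list assigned
    site      = 'GOD';       -- constant; @unused is never referenced, so field 3 is dropped
```

The point is that a variable in the SET clause only carries file data if the same variable name appears in the column list. Note that LOCAL only works when both the client and the server permit local_infile.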
The corrected statement (unfortunately I can't test it myself right now) should be:
LOAD DATA LOCAL INFILE '/Users/user/Downloads/base.csv'
INTO TABLE cep_status_new
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS
(@col1,@col2,@col3,@col4,@col5,@col6,@col7,@col8,@col9)
SET id=NULL, cep=@col1, site='GOD', cidade=@col6, uf=@col7, cepbase='-', segmentacao=@col9, cepstatus=@col2, area='BING', score=99999, fila=5;
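One way to check the result after running the load against the two-line test file is sketched below. It assumes the file really uses '\r\n' line endings and that the statements run in the same session as the LOAD DATA statement. Going by the SET clause and the sample row from the question, cep should come from field 1, cepstatus from field 2, cidade from field 6, uf from field 7 and segmentacao from field 9.

```sql
-- Show truncation or conversion warnings raised by the previous statement.
SHOW WARNINGS;

-- Inspect the columns that are filled from the file rather than from constants.
SELECT id, cep, cepstatus, cidade, uf, segmentacao
FROM cep_status_new
ORDER BY id
LIMIT 5;
```

If the columns still hold unexpected values, the line terminator is worth checking first, since a file with plain '\n' endings will not split on '\r\n'.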