Skipping the first column with LOAD DATA INFILE

Time: 2016-07-03 13:18:06

Tags: mysql

I have a table like this:

mysql> show create table final\G;
*************************** 1. row ***************************
       Table: final
Create Table: CREATE TABLE `final` (
  `id` int(4) NOT NULL AUTO_INCREMENT,
  `cdatetime` varchar(255) NOT NULL,
  `address` varchar(255) NOT NULL,
  `district` varchar(255) NOT NULL,
  `beat` varchar(255) NOT NULL,
  `grid` varchar(255) NOT NULL,
  `crimedescr` varchar(255) NOT NULL,
  `ucr_ncic_code` varchar(255) NOT NULL,
  `latitude` varchar(255) NOT NULL,
  `longitude` varchar(255) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

My CSV file looks like this:

cdatetime,address,district,beat,grid,crimedescr,ucr_ncic_code,latitude,longitude
1/1/06 0:00,3108 OCCIDENTAL DR,3,3C        ,1115,10851(A)VC TAKE VEH W/O OWNER,2404,38.55042047,-121.3914158
1/1/06 0:00,2082 EXPEDITION WAY,5,5A        ,1512,459 PC  BURGLARY RESIDENCE,2204,38.47350069,-121.4901858
1/1/06 0:00,4 PALEN CT,2,2A        ,212,10851(A)VC TAKE VEH W/O OWNER,2404,38.65784584,-121.4621009
1/1/06 0:00,22 BECKFORD CT,6,6C        ,1443,476 PC PASS FICTICIOUS CHECK,2501,38.50677377,-121.4269508

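What I want to do is load the CSV file into the table final. The problem is that the CSV file has no id column, so I would like to tell MySQL to skip the id column and load the data into the remaining columns, while still having id filled in. Ideally it would work like this: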

"1/1/06 0:00,3108 OCCIDENTAL DR,3,3C,1115,10851(A)VC TAKE VEH W/O OWNER,2404,38.55042047,-121.3914158" is loaded into the columns and MySQL automatically puts 1 into the id column, then "1/1/06 0:00,2082 EXPEDITION WAY,5,5A,1512,459 PC BURGLARY RESIDENCE,2204,38.47350069,-121.4901858" is loaded and MySQL puts 2 into the id column, and so on.

Recently the user 'Shadow' told me that I should specify the columns I want to load, so I did something like this:

load data infile '/SacramentocrimeJanuary2006.csv' INTO TABLE final (cdatetime, address, district, beat, grid, crimedescr, ucr_ncic_code, latitude, longitude);

MySQL returns:

ERROR 1261 (01000): Row 1 doesn't contain data for all columns

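According to the MySQL LOAD DATA INFILE manual, the default field separator is not ",", so I tried to change it by adding FIELDS TERMINATED BY ',' at the end of my statement, but that breaks the query. What is the correct syntax here?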

Thanks

ANSWER

mysql> CREATE TABLE `final` (
    ->   `id` int(4) NOT NULL AUTO_INCREMENT,
    ->   `cdatetime` longtext  NULL,
    ->   `address` longtext  NULL,
    ->   `district` longtext  NULL,
    ->   `beat` longtext  NULL,
    ->   `grid` longtext  NULL,
    ->   `crimedescr` longtext  NULL,
    ->   `ucr_ncic_code` longtext  NULL,
    ->   `latitude` longtext  NULL,
    ->   `longitude` longtext  NULL,
    ->   PRIMARY KEY (`id`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.17 sec)

mysql> LOAD DATA infile '/SacramentocrimeJanuary2006.csv'  INTO TABLE final FIELDS TERMINATED BY ',' lines terminated by '\r' IGNORE 1 ROWS (cdatetime, address, district, beat, grid, crimedescr, ucr_ncic_code, latitude, longitude);
Query OK, 7584 rows affected (0.08 sec)
Records: 7584  Deleted: 0  Skipped: 0  Warnings: 0

3 answers:

Answer 0: (score: 1)

I think you need to add the ENCLOSED BY and IGNORE ... ROWS clauses, and they have to come before the column list:

LOAD DATA INFILE '/SacramentocrimeJanuary2006.csv'
INTO TABLE final
FIELDS TERMINATED BY ','
ENCLOSED BY ''
IGNORE 1 ROWS
(cdatetime, address, district, beat, grid, crimedescr, ucr_ncic_code, latitude, longitude);

Answer 1: (score: 1)

Linux:

LOAD DATA INFILE '/home/frank/try_this123.txt'
INTO TABLE final
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(cdatetime, address,district,beat,grid,crimedescr,ucr_ncic_code,latitude,longitude)
set id = NULL;

Or Windows:

LOAD DATA INFILE 'c:\\nate\\try_this123.txt'
INTO TABLE final
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(cdatetime, address,district,beat,grid,crimedescr,ucr_ncic_code,latitude,longitude)
set id = NULL;

mysql> select * from final;
+----+-------------+---------------------+----------+------------+------+-------------------------------+---------------+-------------+---------------+
| id | cdatetime   | address             | district | beat       | grid | crimedescr                    | ucr_ncic_code | latitude    | longitude     |
+----+-------------+---------------------+----------+------------+------+-------------------------------+---------------+-------------+---------------+
|  1 | 1/1/06 0:00 | 3108 OCCIDENTAL DR  | 3        | 3C         | 1115 | 10851(A)VC TAKE VEH W/O OWNER | 2404          | 38.55042047 | -121.3914158  |
|  2 | 1/1/06 0:00 | 2082 EXPEDITION WAY | 5        | 5A         | 1512 | 459 PC  BURGLARY RESIDENCE    | 2204          | 38.47350069 | -121.4901858  |
|  3 | 1/1/06 0:00 | 4 PALEN CT          | 2        | 2A         | 212  | 10851(A)VC TAKE VEH W/O OWNER | 2404          | 38.65784584 | -121.4621009  |
|  4 | 1/1/06 0:00 | 22 BECKFORD CT      | 6        | 6C         | 1443 | 476 PC PASS FICTICIOUS CHECK  | 2501          | 38.50677377 | -121.4269508  |
+----+-------------+---------------------+----------+------------+------+-------------------------------+---------------+-------------+---------------+

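I did not use any enclosing delimiters such as single or double quotes. The question is what happens when one of your addresses contains a comma: every field after it shifts by one column and the whole row is thrown off.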

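That is why, ideally (read: almost always), you want the data enclosed in double quotes, unless the data was generated by you and is almost overly simple, for example: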

1,2,cat,14,8

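So for third-party systems, where you have no control over how the data was entered, you first have to write an ETL routine that cleans the data so that it is safely wrapped and ready for import.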
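
As a rough sketch of loading quoted data, assuming the same final table and a file whose text fields are wrapped in double quotes: the file path and the @beat_raw variable name below are made up for illustration, while OPTIONALLY ENCLOSED BY and the SET clause are standard LOAD DATA options. The TRIM call strips the trailing padding visible in the beat column of the sample CSV.

```sql
-- Hypothetical path; adjust the line terminator to match the actual file.
LOAD DATA INFILE '/path/to/quoted_crime_data.csv'
INTO TABLE final
FIELDS TERMINATED BY ','
       OPTIONALLY ENCLOSED BY '"'   -- a comma inside "..." stays within its field
LINES TERMINATED BY '\n'
IGNORE 1 LINES                      -- skip the header row
(cdatetime, address, district, @beat_raw, grid,
 crimedescr, ucr_ncic_code, latitude, longitude)
SET beat = TRIM(@beat_raw);         -- strip padding such as '3C        '
```

Because id is not in the column list, AUTO_INCREMENT still fills it. Small cleanups like the TRIM above can happen during the load itself; anything more involved probably still belongs in a separate cleaning step before the import.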

Answer 2: (score: 1)

Use the following format (the column list goes after the FIELDS, LINES, and IGNORE clauses):

load data infile '/SacramentocrimeJanuary2006.csv' INTO TABLE final
fields terminated by ','
lines terminated by '\r\n'
ignore 1 lines
(cdatetime, address, district, beat, grid, crimedescr, ucr_ncic_code, latitude, longitude);
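
Which line terminator is right depends on how the file was saved. One way to check after a load, sketched here on the assumption that the table is final and longitude is the last column in the file, is to look at the last byte stored in that column:

```sql
-- If the file ends lines with \r\n but was loaded with LINES TERMINATED BY '\n',
-- the last column keeps a trailing carriage return (hex 0D).
SELECT id, HEX(RIGHT(longitude, 1)) AS last_byte
FROM final
LIMIT 5;
```

A last_byte of 0D would suggest Windows line endings, in which case reloading with LINES TERMINATED BY '\r\n' is the likely fix.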