I need some help. I created my table with the following structure:
CREATE TABLE `my_data` (
`Date` VARCHAR(45) NOT NULL,
`test1` double,
`check1` int,
`test2` double,
`check2` int,
`No` INT NOT NULL AUTO_INCREMENT,
PRIMARY KEY(No)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
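My CSV data is a huge file of 5 GB or more. It captures data every second. The data for consecutive seconds may be identical, but the information is still valid. How can I import all the duplicates? When I try the command below, the system keeps removing the duplicates.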
LOAD DATA LOCAL INFILE 'D:/mydatatable.csv' INTO TABLE my_data FIELDS TERMINATED BY ',' enclosed by '"' lines terminated by '\n' IGNORE 1 LINES
Here are sample records from the CSV:
<table class="demo">
<caption>Table 1</caption>
<thead>
<tr>
<th>date/time</th>
<th>A</th>
<th>B</th>
<th>C</th>
</tr>
</thead>
<tbody>
<tr>
<td>2/23/2015 0:42</td>
<td>3</td>
<td>4</td>
<td>2</td>
</tr>
<tr>
<td>2/23/2015 0:42</td>
<td>3</td>
<td>4</td>
<td>2</td>
</tr>
<tr>
<td>2/23/2015 0:42</td>
<td>3</td>
<td>4</td>
<td>2</td>
</tr>
</tbody>
</table>
CSV data:
2/23/2015 0:42,3,4,2
2/23/2015 0:42,3,4,2
2/23/2015 0:42,3,4,2
Answer 0 (score: 0)
Personally, I cannot reproduce this problem. Are you sure your lines are terminated by \n rather than \r\n?
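If the file was produced on Windows (the `D:/` path hints at that, though it is only a guess), its lines may end in `\r\n`. With `'\n'` as the terminator, the stray `\r` would stay attached to the last field of every row. A minimal sketch of the same import with a CRLF terminator, assuming the five-column mapping used in this answer:

```sql
-- Assumption: the CSV uses Windows (CRLF) line endings.
LOAD DATA LOCAL INFILE 'D:/mydatatable.csv'
INTO TABLE `my_data`
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(`Date`, `test1`, `check1`, `test2`, `check2`);
```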
First, I would try moving the auto_increment column so that it is the first column.
ALTER TABLE `my_data`
CHANGE COLUMN `NO` `NO` INT(11) NOT NULL AUTO_INCREMENT FIRST;
Then I would explicitly list the columns, so that the imported data is mapped to them correctly rather than by implication.
LOAD DATA LOCAL INFILE 'D:/mydatatable.csv'
IGNORE INTO TABLE `my_data`
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES (`DATE`, `test1`, `check1`, `test2`, `check2`);
This ensures that any extra column data in your CSV is ignored, so the auto_increment column is not polluted with '' or 0, etc.
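Extra fields at the end of a row are dropped once the columns are listed, but an unwanted field in the middle of a row needs one more step. A sketch, assuming a hypothetical CSV layout with an extra field right after the timestamp; that field is read into a user variable (`@skip`, a name chosen here for illustration) which is never assigned to a table column:

```sql
-- Hypothetical layout: timestamp, unwanted field, then the four measurements.
-- @skip is a user variable (name chosen for this sketch); it receives the
-- unwanted field and is never assigned to a table column.
LOAD DATA LOCAL INFILE 'D:/mydatatable.csv'
INTO TABLE `my_data`
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(`Date`, @skip, `test1`, `check1`, `test2`, `check2`);
```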
End result:
mysql> LOAD DATA LOCAL INFILE 'D:/mydatatable.csv'
-> IGNORE INTO TABLE `test`.`my_data`
-> FIELDS TERMINATED BY ','
-> OPTIONALLY ENCLOSED BY '"'
-> LINES TERMINATED BY '\n'
-> IGNORE 1 LINES (`DATE`, `test1`, `check1`, `test2`, `check2`);
Query OK, 3 rows affected, 3 warnings (0.04 sec)
Records: 3 Deleted: 0 Skipped: 0 Warnings: 3
mysql> SHOW WARNINGS;
+---------+------+--------------------------------------------+
| Level   | Code | Message                                    |
+---------+------+--------------------------------------------+
| Warning | 1261 | Row 1 doesn't contain data for all columns |
| Warning | 1261 | Row 2 doesn't contain data for all columns |
| Warning | 1261 | Row 3 doesn't contain data for all columns |
+---------+------+--------------------------------------------+
3 rows in set (0.00 sec)
mysql> SELECT * FROM my_data;
+----+----------------+-------+--------+-------+--------+
| NO | DATE           | test1 | check1 | test2 | check2 |
+----+----------------+-------+--------+-------+--------+
|  1 | 2/23/2015 0:42 |     3 |      4 |     2 |   NULL |
|  2 | 2/23/2015 0:42 |     3 |      4 |     2 |   NULL |
|  3 | 2/23/2015 0:42 |     3 |      4 |     2 |   NULL |
+----+----------------+-------+--------+-------+--------+
3 rows in set (0.00 sec)
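The 1261 warnings appear because the sample rows carry four fields while the column list names five, so `check2` is left NULL. A sketch under the assumption that the real file also has four fields per row and that they map to the first four data columns (the question does not say which columns they belong to), followed by a query to confirm that repeated timestamps were kept:

```sql
-- Assumption: four fields per row, mapped to the first four data columns.
-- `check2` is omitted from the list, so it takes its default (NULL)
-- and the "doesn't contain data for all columns" warning is not raised.
LOAD DATA LOCAL INFILE 'D:/mydatatable.csv'
INTO TABLE `my_data`
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(`Date`, `test1`, `check1`, `test2`);

-- List every timestamp that occurs more than once, with its row count.
SELECT `Date`, COUNT(*) AS row_count
FROM `my_data`
GROUP BY `Date`
HAVING COUNT(*) > 1;
```

Comparing `SELECT COUNT(*) FROM my_data` with the number of data lines in the file is another quick way to check that no rows were dropped.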