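I am trying to import a non-CSV data file into MySQL.
1) Fields are separated by newlines, and the field identifier is at the start of each line.
2) Some fields have multiple entries.
3) Not every record has every field populated.
4) There are some blank lines inside fields that need to be filtered out.
5) Records are usually separated by a blank line, but also by a "Number X" line.
Here is a sample of the file showing three records: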
Number 1
ARTIST BOOM JEFF=SINGER
BACKING MUSICIANS=BAND
COMP BOOM JEFF
DATE 1980
TIME 3.23
FIELD3 FRONT ROW
NOTE LIVE RECORDING
Number 2
ARTIST JOHN LEE=VOCAL
COMP JOHN LEE
TIME 4.20
ID 000000230682
PUBLISHER BLAHBLAH
FIELD3 DAY I RODE THE TRAIN
Number 3
ARTIST BURT DAN=NARRATOR
JOHNS RY=DRUMS
STUDIO BAND=ORCHESTRA
FREE DAN=DIRECTOR
COMP JOHNS RY
DATE 1934
DUR 2.32
ID 000055332
PUBLISHER WEEWAH
SHELF 86000002
FIELD3 EVE OF THE WAR
NOTE FROM HE NARRATION "NO MORE THAT IN
THE FIRST YEARS OF THE SEVENTEENTH CENTURY .."
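What is the best way to import this data into MySQL?
Can LOAD DATA INFILE read it directly? Or should I write a script that strips the field identifiers and converts the file to CSV, which LOAD DATA INFILE can then read?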
Answer 0 (score: 1)
I would rather use sed to convert these records into INSERT .. SET ... statements, for example:
INSERT INTO RECORDS SET
ARTIST="BOOM JEFF=SINGER~BACKING MUSICIANS=BAND" ,
COMP="BOOM JEFF" ,
DATE="1980" ,
TIME="3.23" ,
FIELD3="FRONT ROW" ,
NOTE="LIVE RECORDING"
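That is, replace the newlines inside a record's multi-entry fields with a separator such as ~, and then analyze the data with the help of SQL.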
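The INSERT .. SET idea above can be sketched as follows. This is a minimal illustration in Python rather than sed, and it assumes the record has already been parsed into a dict of field name to list of entries. The helper names `sql_quote` and `to_insert_set` are hypothetical, and the table name RECORDS is taken from the example statement. Single-quoted literals are used here because double quotes are not treated as string delimiters when MySQL runs with ANSI_QUOTES enabled.

```python
def sql_quote(value: str) -> str:
    """Quote a string as a MySQL literal, escaping backslashes and single quotes."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def to_insert_set(record: dict, table: str = "RECORDS", sep: str = "~") -> str:
    """Build one INSERT ... SET statement; multi-entry fields are joined with sep."""
    assignments = [
        "`{}`={}".format(name, sql_quote(sep.join(values)))
        for name, values in record.items()
    ]
    return "INSERT INTO `{}` SET\n  {};".format(table, ",\n  ".join(assignments))


# Hypothetical parsed record, based on the first record in the question.
record = {
    "ARTIST": ["BOOM JEFF=SINGER", "BACKING MUSICIANS=BAND"],
    "COMP": ["BOOM JEFF"],
    "DATE": ["1980"],
}
print(to_insert_set(record))
```

For anything beyond a one-off import, passing the values as parameters through a database driver is usually safer than building SQL strings by hand.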
Answer 1 (score: 0)
From what I can see, your best option is a script that parses the data line by line, along the lines of the following (in PHP):
$lines=explode("\n",file_get_contents('file.name'));
$record=null;
//go through all the lines
foreach($lines as $line) {
//if the line is not empty, add the field to the record
if(trim($line)) {
//I am only processing the field name-you'll have to do the same for equal signs
$pos = strpos($line, ' ');
$fieldName=substr($line,0,$pos);
$fieldValue=substr($line,$pos+1);
$record[$fieldName]=$fieldValue;
}
//if it is a blank line and we have a record, save it
else if ($record) {
//insert the record into the database
insertRecord($record);
//empty the record as the next line is a new record
$record=null;
}
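}
//save the last record if the file does not end with a blank line
if ($record) {
insertRecord($record);
}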
function insertRecord($record) {
//to do implement an sql insert statement
}
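A fuller version of the line-by-line approach, covering the points listed in the question, might look like the sketch below (in Python, not the PHP of the answer). It rests on assumptions that are not confirmed by the thread: every record starts with a "Number X" line, as in the sample, so all blank lines can simply be dropped; the set of field identifiers is the one visible in the sample (`FIELDS`); and any line that does not start with a known identifier continues the previous field. The names `parse_records` and `write_csv` are hypothetical.

```python
import csv
import re
import sys

# Assumption: the complete list of field identifiers, taken from the sample.
FIELDS = ["ARTIST", "COMP", "DATE", "TIME", "DUR", "ID",
          "PUBLISHER", "SHELF", "FIELD3", "NOTE"]
RECORD_START = re.compile(r"^Number\s+\d+\s*$")


def parse_records(lines):
    """Yield one dict per record, mapping field name -> list of entries."""
    record, current = {}, None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines are filtered out everywhere
        if RECORD_START.match(line):
            if record:
                yield record
            record, current = {}, None
            continue
        name, _, value = line.partition(" ")
        if name in FIELDS:
            current = name
            record.setdefault(current, []).append(value.strip())
        elif current is not None:
            # No known identifier: treat the line as another entry
            # (or a wrapped line) of the previous field.
            record[current].append(line)
    if record:
        yield record


def write_csv(records, out, sep="~"):
    """Write records as CSV; missing fields become empty strings."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, restval="",
                            lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow({name: sep.join(values)
                         for name, values in record.items()})


if __name__ == "__main__":
    sample = ["Number 1", "ARTIST BOOM JEFF=SINGER",
              "BACKING MUSICIANS=BAND", "", "DATE 1980"]
    write_csv(parse_records(sample), sys.stdout)
```

The resulting file could then be loaded with LOAD DATA INFILE using FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' and IGNORE 1 LINES. Note that a wrapped line whose first word happens to be a field identifier would be misread as a new field, so the output is worth spot-checking.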