I use the following functions to import a CSV file into MySQL:
function csv_2_mysql($source_file, $target_table, $max_line_length=10000) {
    if (($handle = fopen("$source_file", "r")) !== FALSE) {
        $columns = fgetcsv($handle, $max_line_length, ",");
        foreach ($columns as &$column) {
            $column = preg_replace('/[^a-z0-9]/i', '', $column);
        }
        $insert_query_prefix = "INSERT INTO $target_table (".join(",",$columns).")\nVALUES";
        while (($data = fgetcsv($handle, $max_line_length, ",")) !== FALSE) {
            while (count($data)<count($columns))
                array_push($data, NULL);
            $query = "$insert_query_prefix (".join(",",quote_all_array($data)).");";
            mysql_query($query);
        }
        fclose($handle);
    }
}

function quote_all_array($values) {
    foreach ($values as $key=>$value)
        if (is_array($value))
            $values[$key] = quote_all_array($value);
        else
            $values[$key] = quote_all($value);
    return $values;
}

function quote_all($value) {
    if (is_null($value))
        return "NULL";
    $value = "'" . mysql_real_escape_string($value) . "'";
    return $value;
}
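The problem is that, because the CSV files are cut and merged at the source, the header is sometimes not on the first line, so a file might look like this: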
value1,value2,value3,value4
value1,value2,value3,value4
value1,value2,value3,value4
header1,header2,header3,header4
value1,value2,value3,value4
value1,value2,value3,value4
value1,value2,value3,value4
value1,value2,value3,value4
header1,header2,header3,header4
value1,value2,value3,value4
value1,value2,value3,value4
value1,value2,value3,value4
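value1 is unique, so I know there will never be duplicate rows other than the header. How can I adjust the function so that duplicate header rows are removed (if present) and the remaining header row is used for $columns? I would simply set the column names manually, except that each CSV may have a different number of columns (apart from header1 and value1, which are always present because value1 is a unique timestamp).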
Update:
OK, I figured it out, but using both fopen and file_get_contents on the same file feels wrong. Will I run into problems with large CSV files?
function csv_2_mysql($source_file, $target_table, $uid, $nid, $max_line_length=10000) {
    if (($handle = fopen("$source_file", "r")) !== FALSE) {
        $handle2 = file_get_contents($source_file) or exit;
        $handle_row = explode("\n", $handle2);
        foreach ($handle_row as $key => $val) {
            $row_array = explode(',', $val);
            foreach ($row_array as $key => $val) {
                $row_array[$key] = trim(str_replace('"', '', $val));
            }
            if(!in_array('header1', $row_array)) {
                unset ($row_array);
            }
            else {
                $columns = $row_array;
            }
        }
        foreach ($columns as &$column) {
            $column = preg_replace('/[^a-z0-9]/i', '', $column);
        }
        $insert_query_prefix = "INSERT INTO $target_table (".join(",",$columns).")\nVALUES";
        while (($data = fgetcsv($handle, $max_line_length, ",")) !== FALSE) {
            while (count($data)<count($columns))
                array_push($data, NULL);
            $query = "$insert_query_prefix (".join(",",quote_all_array($data)).");";
            mysql_query($query);
        }
        fclose($handle);
    }
}

function quote_all_array($values) {
    foreach ($values as $key=>$value)
        if (is_array($value))
            $values[$key] = quote_all_array($value);
        else
            $values[$key] = quote_all($value);
    return $values;
}

function quote_all($value) {
    if (is_null($value))
        return "NULL";
    $value = "'" . mysql_real_escape_string($value) . "'";
    return $value;
}
Answer 0 (score: -1)
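I guess you could keep an array of the first-column values you have already seen (since you said they are unique) and check each new row against it. If the value is a duplicate, ignore that row and continue.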
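
A minimal sketch of that idea, in case it helps. It reads the file once with fgetcsv, remembers every first-column value it has seen, and skips any row whose first column repeats, which drops the repeated header lines. The function name read_csv_skipping_headers and the $header_marker parameter are made up for this example, and it assumes the first header cell is literally header1, as in the question. It only collects the columns and rows and leaves the INSERT part to the existing code.

```php
<?php
/**
 * Read a CSV whose header row may appear anywhere, any number of times.
 * Returns array($columns, $rows): sanitized header names and the data rows.
 * Hypothetical helper; 'header1' as the marker is an assumption from the question.
 */
function read_csv_skipping_headers($source_file, $header_marker = 'header1', $max_line_length = 10000) {
    $columns = array();
    $rows = array();
    $seen = array(); // first-column values already encountered

    if (($handle = fopen($source_file, 'r')) === false) {
        return array($columns, $rows);
    }
    while (($data = fgetcsv($handle, $max_line_length, ',', '"', '\\')) !== false) {
        if ($data === array(null)) {
            continue; // fgetcsv returns array(null) for a blank line
        }
        $first = trim($data[0]);
        if (isset($seen[$first])) {
            continue; // first column repeats: a duplicate header (or duplicate row)
        }
        $seen[$first] = true;
        if ($first === $header_marker) {
            // The first header row found becomes the column list.
            foreach ($data as $name) {
                $columns[] = preg_replace('/[^a-z0-9]/i', '', $name);
            }
            continue;
        }
        $rows[] = $data;
    }
    fclose($handle);
    return array($columns, $rows);
}
```

The rows are buffered because the header may come after some data rows, so the column list is not known until the whole file has been read. That keeps every row plus the $seen array in memory. For a very large CSV, a two-pass variant would probably be lighter: scan for the header, rewind() the handle, then insert each row as it is read.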