PHP: Remove special characters such as ç when importing data from a CSV into a database

Time: 2017-08-03 07:35:20

Tags: php csv str-replace

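I created a PHP script that lets me upload a large amount of data from a CSV file. During the import, I want to replace a special character such as ç with the letter c. Here is my code: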

        // Assumes $handle (open CSV file), $new_conn (mysqli connection), $counter
        // and the backup variables are defined earlier in the script.
        $sql = "INSERT INTO bill_of_materials(allotment_code, category_name, activity, quantity, end_unit_quantity, unit, description,
        unit_cost, regular_labor_cost, end_unit_labor_cost, type, batch) VALUES";

        while (($line = fgets($handle)) !== false) {
            // Strip the trailing line break, then split on the ";" delimiter
            $fields = explode(";", rtrim($line, "\r\n"));

            // Sanitize and escape each field separately so delimiters and quotes cannot break the query
            $fields = array_map(function ($field) use ($new_conn) {
                return mysqli_real_escape_string($new_conn, sanitize($field));
            }, $fields);

            $sql .= "('" . implode("', '", $fields) . "'),";
            $counter++;
        }

        // Remove the trailing comma left by the loop
        $sql = rtrim($sql, ',');

        if (mysqli_query($new_conn, $sql) === true) {

            echo 1;

            // Database backup file name
            $new_database_file = $new_database . '.sql';

            // Remove any previous backup before writing a new one
            if (file_exists('backup/' . $new_database_file)) {
                unlink('backup/' . $new_database_file);
            }

            // Back up the main database
            $command = "C:/xampp/mysql/bin/mysqldump --host=$host --user=$user --password=$pass $database_name > backup/$new_database_file";
            system($command);
        } else {
            // Print the failed statement for debugging
            echo $sql;
        }

Also, one value in my CSV is "W2-A1 2/F Front Façade - B", and I would like the output to be "W2-A1 2/F Front Facade - B". How can I do this?

1 answer:

Answer 0 (score: -1)

First, make sure the database client uses the correct charset and collation. If the database charset/collation is correct, you can use preg_replace to clean up the dirty characters, like this:

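function sanitize($line){
   $clean = iconv('UTF-8', 'ASCII//TRANSLIT', $line); // attempt to transliterate similar characters
   if ($clean === false) {
      $clean = $line; // iconv failed (e.g. invalid UTF-8 input); fall back to the raw line
   }
   // drop anything but printable ASCII; the original /[^\w]/ would also strip spaces, ";" and "-"
   $clean = preg_replace('/[^\x20-\x7E]/', '', $clean);
   return $clean;
}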
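
The result of iconv with //TRANSLIT depends on the platform and locale, so a more predictable variant is to convert the line to UTF-8 first and then replace the accented letters from an explicit map. The following is only a sketch: the helper names toUtf8 and foldAccents are hypothetical, the Windows-1252 source encoding is an assumption about the CSV file, and the mbstring extension is assumed to be available.

```php
<?php
// Hypothetical helper: convert a CSV line to UTF-8 when it is not valid UTF-8 already.
// The Windows-1252 default is an assumption about the source file, not a fact from the question.
function toUtf8(string $line, string $sourceEncoding = 'Windows-1252'): string
{
    if (mb_check_encoding($line, 'UTF-8')) {
        return $line; // already valid UTF-8, leave it untouched
    }
    return mb_convert_encoding($line, 'UTF-8', $sourceEncoding);
}

// Hypothetical helper: replace selected accented letters with plain ASCII ones.
// Extend the map with whatever characters actually occur in the data.
function foldAccents(string $utf8Line): string
{
    $map = [
        "\u{00E7}" => 'c', // ç
        "\u{00C7}" => 'C', // Ç
        "\u{00E9}" => 'e', // é
        "\u{00C9}" => 'E', // É
    ];
    return strtr($utf8Line, $map);
}

$raw = "W2-A1 2/F Front Fa\xE7ade - B"; // "\xE7" is ç in Windows-1252
echo foldAccents(toUtf8($raw)), PHP_EOL; // prints "W2-A1 2/F Front Facade - B"
```

An explicit map only touches the characters listed in it, so spaces, hyphens and the ";" delimiter pass through unchanged.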

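If the preg_replace approach does not help (for example, your byte stream really is corrupted, as can happen when an old Excel source file is saved as CSV), you may have to translate the characters at the byte level. You first have to find out the corrupted byte sequences, e.g. by dumping the byte values with ord($line[$position]). Example: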

function sanitize($line){
    $map = [
        // corrupted chars sequence -> fixed chars
        "\xC3\xA8" => 'č',
        "\xC3\x88" => 'Č',
        "\xC3\xB9" => 'ů',
        "\xC3\x99" => 'Ů',
        "\xC3\xAC" => 'ě',
        "\xC3\x8C" => 'Ě',
        "\xC3\xB8" => 'ř',
        "\xC3\x98" => 'Ř',
        "\x53\xC2\x8D" => 'Š',
        "\xC2\xA9" => 'Š',
    ];
    return str_replace(array_keys($map), $map, $line);
}
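
The keys of such a map have to be found by inspecting the raw bytes of a damaged line. A minimal sketch of that dump is shown here; dumpNonAsciiBytes is a hypothetical helper name that is not part of the answer, and it simply reports every byte above 0x7F together with its offset.

```php
<?php
// Hypothetical helper: list every non-ASCII byte in a line as offset => hex value.
function dumpNonAsciiBytes(string $line): array
{
    $found = [];
    $length = strlen($line);
    for ($position = 0; $position < $length; $position++) {
        $byte = ord($line[$position]); // numeric value of the byte at this offset
        if ($byte > 0x7F) {
            $found[$position] = sprintf('%02X', $byte);
        }
    }
    return $found;
}

// The first key of the map above consists of the two bytes C3 and A8.
print_r(dumpNonAsciiBytes("\xC3\xA8"));
```

Adjacent offsets in the result belong to the same corrupted sequence and can be copied into the map as one "\x..\x.." key.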