PHP preg_split appears to crash on large strings

Time: 2012-11-15 18:41:22

Tags: php large-files preg-split

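I have been trying to load the USDA nutrition data from the ASCII version of the data into a MySQL database. I used the script from here, and everything came in except the NUT_DATA.txt file, which is about 34 MB. The PHP script reads the file in correctly with file_get_contents (so the memory limit should not be the issue; it is set to 128M anyway). However, when it reaches the preg_split('#\n#', $file) line, the script ends without any further output. If I split the file into smaller pieces, it runs fine. Why does it not work on the full file? I have memory_limit set to 128M and upload_max_filesize set to 128M. The whole file appears to be read in; only preg_split seems to have a problem.

Am I missing something obvious?

Thanks!

Here is the PHP code I have been testing with: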

<?php
// from http://drupal.org/node/107806

//$dbh=mysql_connect(/*db info*/);
//mysql_select_db(/*db info*/);
try
{
  $dbh = new PDO('mysql:host=127.0.0.1;dbname=XXXXXXX', 'XXXXXXX', 'XXXXXXX');
  echo "PDO access worked! <br />";
} catch (PDOException $e)
{
  echo "PDO error: " . $e->getMessage() . "<br />";
  die();
}

//List of SR20 filenames with associated field counts.
$filenames = array(//"DERIV_CD" => 2,
//        "FD_GROUP" => 2,
//        "NUTR_DEF" => 6,
//        "SRC_CD" => 2,
//        "WEIGHT" => 7,
//        "FOOTNOTE" => 5,
//        "DATA_SRC" => 9,
//        "DATSRCLN" => 3,
//        "FOOD_DES" => 14,
        "NUT_DATA" => 17
//        "NUT_DATA2" => 17,
//        "NUT_DATA_test" => 17
     );
foreach($filenames as $filename => $count) {
    echo "inside foreach: $filename ";
    $file = file_get_contents($filename.'.txt');
    echo " read in the file, length = " . strlen($file) . "<br />";
    $lines = preg_split('#\n#', $file);
    echo "made it past preg_split <br />";
//    print_r($lines);
    echo " lines = " . count($lines). "<br />";
    for($i=0;$i<count($lines);++$i) {
        $line = $lines[$i];
//        echo "Inside for loop, line = '" . $line . "'<br />";
        //Some text fields are split over several lines. Concatenate them.
        while(substr_count($line, '^') < $count-1 || (substr_count($line, '~')%2) == 1) {
            ++$i;
            if($i>=count($lines)) { break; }
            $line .= '<br />'.$lines[$i];
        }
        $fields = trim($line);
        if(strlen($fields) == 0) continue;
        $fields = str_replace(array("'", '~', '^'), array("''", "'", ','), $fields);
        $sql = "INSERT INTO `$filename` VALUES($fields);";
        //Insert zeroes for unfilled values.
        $sql = str_replace(array(',,', ',)'), array(',0,', ',0)'), $sql);
        $sql = str_replace(array(',,', ',)'), array(',0,', ',0)'), $sql);
        $sql = str_replace(array(',,', ',)'), array(',0,', ',0)'), $sql);
        //Some single-char fields aren't quoted.
        $sql = preg_replace('#,([A-Z])\);$#', ",'$1');", $sql);
// comment out the database interface until things are working
/*        $stmt = $dbh->prepare($sql);
        if(!$stmt->execute()) 
        {
            echo $sql;
        }
*/
    }
}
echo "done";
?>

2 Answers:

Answer 0: (score: 0)

I added the following two lines at the top of the PHP script:

ini_set('display_errors', 'On');
error_reporting(E_ALL | E_STRICT);

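It told me the script had run out of memory. Raising the memory limit again (to 256M) solved the problem. I guess I had no way of predicting how much memory the task would use.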
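For anyone who would rather not keep raising memory_limit: the 34 MB string and the array of lines built from it both have to fit in memory at the same time, and a PHP array of several hundred thousand short strings costs noticeably more than the raw text. One possible alternative is to read the file one line at a time. The following is only a sketch, not the original script's method. It assumes PHP 5.5 or later (for generators and finally), and readLines and readRecords are hypothetical helper names, not part of the script in the question.

```php
<?php
// Hypothetical helper (not in the original script): yields the file one line
// at a time, so the full list of lines is never held in memory as an array.
function readLines($path)
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        while (($line = fgets($handle)) !== false) {
            // Strip the trailing line ending that fgets keeps.
            yield rtrim($line, "\r\n");
        }
    } finally {
        fclose($handle);
    }
}

// Hypothetical helper: joins continuation lines into one record, using the
// same completeness test as the while loop in the question ($fieldCount - 1
// carets and an even number of tildes).
function readRecords($path, $fieldCount)
{
    $record = null;
    foreach (readLines($path) as $line) {
        $record = ($record === null) ? $line : $record . '<br />' . $line;
        $incomplete = substr_count($record, '^') < $fieldCount - 1
            || substr_count($record, '~') % 2 === 1;
        if (!$incomplete) {
            yield $record;
            $record = null;
        }
    }
    if ($record !== null) {
        // A trailing partial record is passed on rather than silently dropped.
        yield $record;
    }
}
```

With this in place, the outer loop could become something like foreach (readRecords($filename.'.txt', $count) as $line) { ... }, keeping the trim, the empty-line check, and the SQL building from the original loop body. Calling memory_get_peak_usage() at the end of both versions should show whether this helps on your setup.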

Answer 1: (score: -1)

Add this setting to httpd.conf:

ThreadStackSize 8388608