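I'm aware of How to read large worksheets from large Excel files (27MB+) with PHPExcel? and I have tried to implement the chunked reading discussed in that question, but I'm still getting OOM errors. The file itself is just under 5MB, with 9000+ rows (yes, it's over 9000!) and columns ranging from A to V.
I don't want the user to have to edit this file in any way before uploading and processing it, because at the moment this is an entirely manual process and I'd like to replace it completely with an automated one. The file is in xls format, identified as Excel5 by PHPExcel.
My PHP memory limit is currently set to 128M, running on Ubuntu Server.
No matter what chunk size I set, I eventually hit OOM. With the chunk size set to 200 it actually runs better (I can get to around row 7000); with it set to 1, it OOMs around row 370. So I believe 'something' is being stored, or loaded into memory on each iteration of the chunk read and then never discarded, eventually causing the OOM, but I can't see where this is happening.
I'm very much an amateur programmer; this is just something I do on the side of my managed-services role at work, to try to make our lives easier.
The whole point of this code is to read the Excel file, filter out the 'crap', and then save it as a CSV (right now I'm just dumping it to the screen rather than to a CSV). The way things are going, I'm tempted to call excel2csv from the PHP script and then try to clean up the CSV instead... but that feels like giving up when I might be close to a solution.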
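One way to test the suspicion that something accumulates on every iteration is to log memory use after each chunk and watch whether it climbs. The following is only a minimal sketch: `memoryReport()` is a hypothetical helper (not part of PHPExcel), and the `range()` call is a placeholder standing in for the real `$objReader->load(...)` and processing step.

```php
<?php
// Hypothetical helper (not a PHPExcel function): formats current and peak memory use.
function memoryReport($label)
{
    return sprintf(
        '%s: current %.1f MB, peak %.1f MB',
        $label,
        memory_get_usage(true) / 1048576,
        memory_get_peak_usage(true) / 1048576
    );
}

$growth   = array();            // bytes gained (or lost) per chunk
$previous = memory_get_usage();

for ($chunk = 1; $chunk <= 3; $chunk++) {
    // Placeholder for the real work: load one chunk, process it, release it.
    $data = range(1, 50000);
    unset($data);

    $now            = memory_get_usage();
    $growth[$chunk] = $now - $previous;
    $previous       = $now;

    echo memoryReport("after chunk $chunk"), "\n";
}
```

If the per-chunk figures in `$growth` stay positive in the real loop, memory is being retained between loads rather than within a single load.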
<?php
error_reporting(E_ALL);
set_time_limit(0);
date_default_timezone_set('Europe/London');
require_once 'Classes/PHPExcel/IOFactory.php';
class chunkReadFilter implements PHPExcel_Reader_IReadFilter
{
private $_startRow = 0;
private $_endRow = 0;
private $_columns = array();
/** Set the list of rows that we want to read */
public function setRows($startRow, $chunkSize, $columns) {
$this->_startRow = $startRow;
$this->_endRow = $startRow + $chunkSize;
$this->_columns = $columns;
}
public function readCell($column, $row, $worksheetName = '') {
// Only read the heading row, and the rows that are configured in $this->_startRow$
if ($row >= $this->_startRow && $row < $this->_endRow) {
if(in_array($column,$this->_columns)) {
return true;
}
}
return false;
}
}
$target_dir = "uploads/";
$file_name = $_POST["file_name"];
$full_path = $target_dir . $file_name;
echo "Processing ". $file_name . '; <br>';
ob_flush();
flush();
/** /** As files maybe large in memory, use a temp file to handle them
$cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
$cacheSettings = array( 'memoryCacheSize' => '8MB');
PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);
**/
$inputFileName = $full_path;
echo 'Excel reader started<br/>';
/** First we should get the type of file **/
$filetype = PHPExcel_IOFactory::identify($inputFileName);
echo 'File of type: ' . $filetype . ' found<br/>';
/** Load $inputFileName to a PHPExcel Object - https://github.com/PHPOffice/PHPExcel/blob/develop/$
/** Define how many rows we want to read for each "chunk" **/
$chunkSize = 1;
/** Create a new Instance of our Read Filter **/
$chunkFilter = new chunkReadFilter();
$objReader = PHPExcel_IOFactory::createReader($filetype);
/** Tell the Reader that we want to use the Read Filter that we've Instantiated **/
$objReader->setReadFilter($chunkFilter);
/** Loop to read our worksheet in "chunk size" blocks **/
for ($startRow = 2; $startRow <= 65000; $startRow += $chunkSize) {
$endRow = $startRow+$chunkSize-1;
echo 'Loading WorkSheet using configurable filter for headings row 1 and for rows ',$startR$
/** Tell the Read Filter, the limits on which rows we want to read this iteration **/
$chunkFilter->setRows($startRow,$chunkSize,range('A','T'));
/** Load only the rows that match our filter from $inputFileName to a PHPExcel Object **/
$objPHPExcel = $objReader->load($inputFileName);
// Do some processing here
// $sheetData = $objPHPExcel->getActiveSheet()->toArray(null,true,true,true);
$sheetData = $objPHPExcel->getActiveSheet()->rangeToArray("A$startRow:T$endRow");
var_dump($sheetData);
// Clear the variable to not go over memory!
$objPHPExcel->disconnectWorksheets();
unset ($sheetData);
unset ($objPHPExcel);
ob_flush();
flush();
echo '<br /><br />';
}
/** This loads the entire file, crashing with OOM
try {
$objPHPExcel = PHPExcel_IOFactory::load($inputFileName);
echo 'loaded sheet into memory<br>';
} catch(PHPExcel_Reader_Exception $e) {
die('Error loading file: '.$e->getMessage());
}
$objWriter = PHPExcel_IOFactory::createWriter($objPHPExcel, 'CSV');
echo 'Saving sheet as CSV<br>';
$objWriter->setSheetIndex(0);
$objWriter->save('./uploads/'.$file_name.'.csv');
echo 'Processed 1 sheet';
ob_flush();
flush();
**/
echo "<body><table>\n\n";
/**
$f = fopen($file_name, "r");
while (($line = fgetcsv($f)) !== false) {
echo "<tr>";
foreach ($line as $cell) {
echo "<td>" . htmlspecialchars($cell) . "</td>";
}
echo "</tr>\n";
}
fclose($f);
**/
echo "\n</table></body></html>";
?>
The error shown in the Apache log is:
[Fri Mar 31 15:35:27.982697 2017] [:error] [pid 1059] [client 10.0.2.2:53866] PHP Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 45056 bytes) in /var/www/html/Classes/PHPExcel/Shared/OLERead.php on line 93, referer: http://localhost:8080/upload.php
Answer 0 (score: 1)
unset ($objPHPExcel);
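If you check the PHPExcel documentation, this call alone cannot fully unset $objPHPExcel, because of the cyclic references between the spreadsheet, its worksheets and their cells, and so it leaks memory. The recommendation is to break those cyclic references first: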
$objPHPExcel->disconnectWorksheets();
unset($objPHPExcel);
There will still be some memory leakage, but this should allow more memory to be freed between chunks.
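To illustrate why breaking the cycle matters, here is a small self-contained sketch. `DemoWorkbook` and `DemoSheet` are hypothetical stand-ins invented for this example, not PHPExcel classes; they only mimic the parent/child back-reference pattern described above, and a destructor counter is used to show when objects are actually freed.

```php
<?php
// Hypothetical stand-ins for a workbook and its sheets; NOT PHPExcel classes.
class DemoSheet
{
    public $parent;
    public static $destroyed = 0;

    public function __construct($parent)
    {
        $this->parent = $parent;    // child points back at its parent: a cycle
    }

    public function __destruct()
    {
        self::$destroyed++;         // count how many sheets have been freed
    }
}

class DemoWorkbook
{
    public $sheets = array();

    public function addSheet()
    {
        $this->sheets[] = new DemoSheet($this);
    }

    // Break the parent/child cycle so plain reference counting can free everything.
    public function disconnectSheets()
    {
        foreach ($this->sheets as $sheet) {
            $sheet->parent = null;
        }
        $this->sheets = array();
    }
}

// Case 1: plain unset. The sheet still references the workbook, so neither is freed.
$book = new DemoWorkbook();
$book->addSheet();
unset($book);
$afterPlainUnset = DemoSheet::$destroyed;

// Case 2: disconnect first. The sheet is freed as soon as the cycle is broken.
$book = new DemoWorkbook();
$book->addSheet();
$book->disconnectSheets();
unset($book);
$afterDisconnect = DemoSheet::$destroyed;

// The pair leaked in case 1 is only reclaimed once the cycle collector runs.
gc_collect_cycles();
$afterGc = DemoSheet::$destroyed;
```

In this toy case, calling `gc_collect_cycles()` after each chunk also reclaims the leaked pair. Whether that helps with PHPExcel itself is an assumption to verify by measuring, not something the documentation cited above promises.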