Importing a large Excel file in Symfony is very slow

Time: 2017-05-03 11:24:19

Tags: php excel symfony doctrine

I have a script that imports a large Excel file using many nested foreach loops. After about 50 iterations it becomes unbearably slow... Can I improve this somehow?

I have tried to make the structure as readable as possible with this pseudocode:

foreach worksheet (approx 20) {
    NEW DB ENTRY, PERSIST, FLUSH (account)
    foreach row (10-100){
        NEW DB ENTRY, PERSIST, FLUSH (object)
        foreach column (approx. 10){
            CREATE NEW DB ENTRY, FOREIGN KEY to 'object', PERSIST, FLUSH (weekdates)
        }
        foreach column (approx. 50){
            CREATE NEW DB ENTRY, FOREIGN KEY to 'object', PERSIST, FLUSH (scheduleEntry)

            CREATE NEW DB ENTRY, FOREIGN KEY to 'scheduleEntry', PERSIST, FLUSH (scheduleObject)

            CREATE NEW DB ENTRY, FOREIGN KEY to 'scheduleObject', PERSIST, FLUSH (scheduleModule)

           /* WORST CASE IS THAT HERE WE HAVE FLUSHED 100000 times */
        }
    }
}

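Is there any way to speed this up, especially the last foreach? I think I need to flush every time because I have to set the FOREIGN KEY to the newly created entry. Am I right? I estimate the Excel file would take more than 24 hours to import. Its size is roughly the numbers given in the example.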

The actual (still simplified) code looks like this:

/* Create Excel */
$excel = $this->getContainer()->get('phpexcel')->createPHPExcelObject(Constants::FULL_PATH . 'excel/touren_' . $filename . '.xls');
$sheets = $excel->getAllSheets();
foreach ($sheets as $id => $sheet) {
    $ws = $sheet->toArray();

    /* Read sth from first line and create an 'account' from this */
    $n = new Network();
    ....
    $em->persist($n);

    try {
        $em->flush();
        $output->writeln('----><info>Inserted in DB</info>');
    } catch (\Exception $e) {
        $output->writeln('----><error>DB ERROR</error>');
    }

    /* Go through all rows of current WorkSheet */
    foreach ($ws as $row) {
        /* Create new Object */
        $object = new Object();
        ...
        $em->persist($object);

        try {
            $em->flush();
            $output->writeln("------->Save Object to DB: <info>OK</info>");
        } catch (\Exception $e) {
            $output->writeln("------->Save Object to DB: <error>Failed: " . $e->getMessage() . "</error>");
        }

       /* Create new Tour for weekday/client */
       $tour = new Tour();
       $tour->setNetwork($n);

      /* More foreach */
      foreach ($clientKey as $filialNo => $filialKey) {
          $tourObject = new TourObject();
          $tourObject->setTour($tour);
          $tourObject->setObject($object);
          $em->persist($tourObject);


         /* Count Intervals */
        foreach ($filialKey as $tasks) {
            if (!$tourObject->getModule()->contains($module)) {
                $tourObject->addModule($module);
                $em->persist($tourObject);

                /* More foreach */
                foreach ($period as $date) {
                    $schedule = new Schedule();
                    $schedule->setTour($tour);
                    ....
                    $em->persist($schedule);
                    try {
                        $em->flush();
                        $output->writeln("------->Save Schedule to DB: <info>OK</info>");
                    } catch (\Exception $e) {
                        $output->writeln("------->Save Schedule to DB: <error>Failed: " . $e->getMessage() . "</error>");
                    }


                    $scheduleObject = new ScheduleObject();
                    $scheduleObject->setSchedule($schedule);
                    ....
                    $em->persist($scheduleObject);
                    try {
                        $em->flush();
                        $output->writeln("------->Save ScheduleObject to DB: <info>OK</info>");
                    } catch (\Exception $e) {
                        $output->writeln("------->Save ScheduleObject to DB: <error>Failed: " . $e->getMessage() . "</error>");
                    }

                    $scheduleObjectModule = new ScheduleObjectModule();
                    $scheduleObjectModule->setScheduleObject($scheduleObject);
                    $em->persist($scheduleObjectModule);
                    try {
                        $em->flush();
                        $output->writeln("------->Save ScheduleObjectModule to DB: <info>OK</info>");
                    } catch (\Exception $e) {
                        $output->writeln("------->Save ScheduleObjectModule to DB: <error>Failed: " . $e->getMessage() . "</error>");
                    }
                }
            }
        }
      }

      /* Flush all?!? */
      try {
            $em->flush();
            $output->writeln("------->Save Task to DB: <info>OK</info>");
      } catch (\Exception $e) {
            $output->writeln("------->Save Task to DB: <error>Failed: " . $e->getMessage() . "</error>");
      }
    }
}

2 answers:

Answer 0 (score: 2)

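Every entity you create/persist through the EntityManager is stored in the UnitOfWork and becomes a "managed" entity. As this UnitOfWork fills up, it gets quite heavy on the system, because every flush has to check all managed entities for changes. You can call $entityManager->clear() after each "sheet" so that the UoW is emptied after every iteration.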

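The EntityManager has a single UnitOfWork that tracks all entity classes. In Doctrine ORM 2.x, clear() also accepts an entity class name so that only entities of that class are detached, but since you create many different kinds of entities, I suggest not specifying a class and simply clearing everything.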

  ...
  /* Flush all?!? */
  try {
        $em->flush();
        $em->clear();
        $output->writeln("------->Save Task to DB: <info>OK</info>");
  } catch (\Exception $e) {
        $output->writeln("------->Save Task to DB: <error>Failed: " . $e->getMessage() . "</error>");
  }

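Alternatively, you can insert into the database with native queries, but that may not always be what you want in terms of data consistency etc.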

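Also, as mentioned above, you do not need to flush after every entity. Doctrine works out the insert order and fills in the foreign keys of associated entities during flush, so new entities can reference each other before they have IDs. If you call flush only once, after each worksheet, Doctrine will execute all the insert statements at once.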
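As a rough illustration of the advice in this answer, here is a minimal sketch that persists everything for a worksheet and then calls flush() and clear() once. It assumes the entity classes and setters shown in the question (Network, Tour, Schedule, ScheduleObject) live in the global namespace and that each sheet exposes toArray(); the function name importSheets is hypothetical, and the field mapping from each row is left out.

```php
<?php

use Doctrine\ORM\EntityManagerInterface;

/**
 * Hypothetical helper: imports every worksheet with one flush per sheet.
 *
 * @param iterable $sheets worksheets exposing toArray(), as in the question
 */
function importSheets(EntityManagerInterface $em, iterable $sheets): void
{
    foreach ($sheets as $sheet) {
        // One Network per worksheet, as in the question.
        $network = new Network();
        $em->persist($network);

        foreach ($sheet->toArray() as $row) {
            // Populate the entities from $row here (omitted in this sketch).
            $tour = new Tour();
            $tour->setNetwork($network);
            $em->persist($tour);

            $schedule = new Schedule();
            $schedule->setTour($tour);
            $em->persist($schedule);

            // No flush needed to obtain a foreign key: associating the
            // objects is enough, Doctrine resolves the IDs during flush().
            $scheduleObject = new ScheduleObject();
            $scheduleObject->setSchedule($schedule);
            $em->persist($scheduleObject);
        }

        // Write the whole worksheet in one go, then empty the UnitOfWork
        // so that it does not keep growing across worksheets.
        $em->flush();
        $em->clear();
    }
}
```

Because clear() detaches every managed entity, it is called here only at a point where nothing created earlier is needed again: each worksheet gets its own Network, so no reference crosses the clear() call. Note also that when flush() throws, Doctrine closes the EntityManager, so catching the exception and carrying on with the same manager will not work.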

Answer 1 (score: 0)

I think a good solution is to use a native database utility such as MySQL's LOAD DATA INFILE.

This will be much faster than anything you can write in PHP.
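A hedged sketch of what that could look like from PHP: dump the rows to a CSV file and let MySQL load it in one statement. The table name schedule, the columns tour_id and date, and both function names are hypothetical. The sketch assumes a PDO connection created with PDO::MYSQL_ATTR_LOCAL_INFILE => true and a server that has local_infile enabled. Foreign keys between tables would have to be resolved separately, since the database no longer sees Doctrine entities.

```php
<?php

/**
 * Writes the rows (arrays of scalar values) to a CSV file.
 */
function writeRowsToCsv(array $rows, string $csvPath): void
{
    $handle = fopen($csvPath, 'w');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $csvPath for writing");
    }
    foreach ($rows as $row) {
        // Explicit delimiter, enclosure and escape character so the
        // format matches the LOAD DATA options used below.
        fputcsv($handle, $row, ',', '"', '\\');
    }
    fclose($handle);
}

/**
 * Loads the CSV file into the (hypothetical) schedule table.
 * Returns the number of rows MySQL reports as loaded.
 */
function bulkLoadCsv(PDO $pdo, string $csvPath): int
{
    $sql = sprintf(
        "LOAD DATA LOCAL INFILE %s INTO TABLE schedule
         FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
         LINES TERMINATED BY '\\n'
         (tour_id, date)",
        $pdo->quote($csvPath)
    );

    return (int) $pdo->exec($sql);
}
```

The trade-off is that this bypasses Doctrine entirely: lifecycle callbacks, validation and association handling do not run, so it fits best for flat tables or when the IDs are already known.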