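PHP - Parse and validate text file data and import it into a MySQL database

Date: 2018-05-07 11:44:14

Tags: php mysql performance php-resque

I am using php-resque to parse and validate data from large files, and then import that data into a MySQL database.

I already know that LOAD DATA INFILE can read the lines of a text file into a table, but it does not perform any validation.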

My database structure:

ItemsFile table:

id  filename  filepath  valid_items  invalid_items  processed_items  processed

Item table:

id  uid  item  file_id  created_at

My Resque job class is shown below.

php-resque forks a child process and instantiates the ItemsFileProcessor class, then:

  1. setUp() is called
  2. perform() is called

    /**
     * Read and validate items from a file, and store them in a database.
     */
    
    class ItemsFileProcessor {
    
        //ItemsFile Model instance
        private $items_file = null;
    
        //Item Model instance
        private $item = null;
    
        //retrieved from ItemsFile table.
        private $file = null;
    
        public function __construct() {
            $this->items_file = new ItemsFile();
            $this->item       = new Item();
        }
    
        public function setUp() {
    
            if (isset($this->args['file_id'])) {
    
                //get file from ItemsFile Table by id.
                $this->file = $this->items_file->getFile($this->args['file_id']);
    
                if (empty($this->file)) {
    
                    //End job processing if file does not exist.
                    exit(-1);
    
                }
    
            }
        }
    
        public function perform() {
    
            //NodeJs, socket.io, redis, broadcasting system
            EventBroadcaster::broadcast('app-jobs-channel', 'file_processing_started');
    
            $processed_items = 0;
            $valid_items     = 0;
            $invalid_items   = 0;
    
            //item validation class instance
            $item_validator = new ItemValidator();
    
            try {
    
                $tmp_file = new SplFileObject($this->file->filepath);
    
                //Read items from file, and validate each item.
                while ($tmp_file->valid()) {
                    $line = trim($tmp_file->fgets());
                    if ($line !== '') {
                        if ($item_validator->isValid($line, new ItemValidationRule())) {
    
                            //store item in Item table.
                            $this->item->create([
                                    'uid'     => 'foo',
                                    'item'    => $line,
                                    'file_id' => $this->file->id,
                                ]);
    
                            $valid_items++;
    
                        } else {
    
                            $invalid_items++;
    
                        }
    
                        $processed_items++;
    
                    }
                }
    
                //update ItemsFile Table record
                $this->items_file->update(
                    $this->file->id,
                    [
                        'processed_items'  => $processed_items,
                        'valid_items'      => $valid_items,
                        'invalid_items'    => $invalid_items,
                        'processed'        => 'Processed',
                    ]
                );
    
                EventBroadcaster::broadcast('app-jobs-channel', 'file_processing_completed');
    
            } catch (RuntimeException | LogicException $exception) {
    
                //broadcast failure.
                EventBroadcaster::broadcast('app-jobs-channel', 'file_processing_failed');
                Logger::getInstance()->log('ProcessContactFile Exception: '.$exception->getMessage(), Logger::LOGTYPE_ERROR);
    
                exit(-1);
    
            }
        }
    
    }
    

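My problems:

  • Processing a file takes a long time.
  • MySQL has to handle every insert request one at a time. LOAD DATA INFILE is much faster.

My question:

Is there a way to optimize this, or to bring LOAD DATA INFILE into the process somehow?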

1 answer:

Answer 0: (score: 1)

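Handling files in PHP can run into a lot of performance problems. I would therefore suggest parsing the file with a shell script and having it return a single string (one query that covers all of your inserts). After that, you only need to execute that one query.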
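The "one query for many inserts" idea can also be sketched without leaving PHP, by collecting validated lines and sending them as a multi-row INSERT. This is only a minimal sketch under assumptions: the function name `buildBatchInsert` is hypothetical, the table is assumed to be named `Item` with the columns listed in the question, and `?` placeholders assume a PDO-style prepared statement.

```php
<?php
/**
 * Build one multi-row INSERT for a batch of already validated items.
 * Hypothetical helper: the table name "Item" and its columns are taken
 * from the question and may differ in the real schema.
 *
 * @param string[] $items Validated lines from the file.
 * @return array Two elements: the SQL text and the flat parameter list.
 */
function buildBatchInsert(array $items, string $uid, int $fileId): array
{
    if ($items === []) {
        throw new InvalidArgumentException('Batch must not be empty.');
    }

    // One "(?, ?, ?)" group per row, joined into a single VALUES list.
    $groups = implode(', ', array_fill(0, count($items), '(?, ?, ?)'));
    $sql    = "INSERT INTO Item (uid, item, file_id) VALUES $groups";

    // Flatten the rows in the same order as the placeholders.
    $params = [];
    foreach ($items as $item) {
        array_push($params, $uid, $item, $fileId);
    }

    return [$sql, $params];
}

[$sql, $params] = buildBatchInsert(['a', 'b'], 'foo', 7);
echo $sql, PHP_EOL;
// INSERT INTO Item (uid, item, file_id) VALUES (?, ?, ?), (?, ?, ?)
```

Inside the read loop, valid lines could be buffered and flushed every few hundred or thousand rows with `$pdo->prepare($sql)->execute($params)`, which replaces that many single-row round trips with one. The batch size is a tuning choice, and the server's `max_allowed_packet` setting limits how large one statement can be.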

If this is not clear, I can help.
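Since the question asks about LOAD DATA INFILE specifically, another possible route is to keep validation in PHP but write only the valid lines to a temporary file and bulk-load that file afterwards. The sketch below is an assumption-laden illustration, not the asker's code: `writeValidLines` is a hypothetical helper, the `$isValid` callback stands in for the question's `ItemValidator`, and the items are assumed to contain no tab or backslash characters, which LOAD DATA treats specially by default.

```php
<?php
/**
 * Copy the valid lines of $source into $target, one item per line.
 * Hypothetical helper; $isValid stands in for the question's ItemValidator.
 *
 * @return int[] Two counters: [valid, invalid].
 */
function writeValidLines(string $source, string $target, callable $isValid): array
{
    $in      = new SplFileObject($source, 'r');
    $out     = new SplFileObject($target, 'w');
    $valid   = 0;
    $invalid = 0;

    foreach ($in as $raw) {
        $line = trim((string) $raw);
        if ($line === '') {
            continue; // skip blank lines, as the original loop does
        }
        if ($isValid($line)) {
            $out->fwrite($line . "\n");
            $valid++;
        } else {
            $invalid++;
        }
    }
    $out->fflush();

    return [$valid, $invalid];
}

// Small demonstration with a toy rule: only lowercase letters are valid.
$source = tempnam(sys_get_temp_dir(), 'src');
$target = tempnam(sys_get_temp_dir(), 'dst');
file_put_contents($source, "alpha\n\nbad!\nbeta\n");

$counts = writeValidLines($source, $target, function (string $line): bool {
    return preg_match('/^[a-z]+$/', $line) === 1;
});
echo implode(',', $counts), PHP_EOL; // 2,1

// The load step would then be a single statement along these lines
// (assumes LOCAL INFILE is enabled on both client and server):
//   LOAD DATA LOCAL INFILE '<target path>' INTO TABLE Item
//   LINES TERMINATED BY '\n' (item) SET uid = 'foo', file_id = <id>;
```

The counters returned here map directly onto the `valid_items` and `invalid_items` columns of the ItemsFile record, so the final update in `perform()` could stay as it is while the per-row `create()` calls go away.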