Question

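I have a scheduled system-cleanup method that goes through all files in the "storage" table, selects the file type we need (property photos), and then checks for each file whether the corresponding listing still exists in the database. If it does not, the method removes the record from the database and deletes the file itself.

Now about the problem. Initially I did not use chunk(); it was just Model::all() to select everything, and everything worked fine. But at this point I have 200,000 records in that storage table, and these operations started crashing because of huge memory consumption. So I decided to go with chunk().

The problem is that it should work fine now, but at some random moment (somewhere in the middle of the process) code execution stops as if the operation had completed. No error is logged anywhere, and the task is not fully finished.

Could you tell me what might cause this weird behavior?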

public function verifyPhotos() {
    // Instantiating required models and putting them into a single array so they can be passed to a closure
    $models = [];
    $models['storage'] = App::make('Store');
    $models['condo'] = App::make('Condo');
    $models['commercial'] = App::make('Commercial');
    $models['residential'] = App::make('Residential');
    // Obtaining and processing all records from the storage chunk by chunk
    $models['storage']->where('subject_type', '=', 'property_photo')->chunk(10000, function($files) use(&$models) {
        // Going through each record in current chunk
        foreach ($files as $photo) {
            // If record's subject type is Condo
            if ($photo->subject_name == 'CondoProperty') {
                // Selecting Condo model to work with
                $current_model = $models['condo'];
            // If record's subject type is Commercial
            } elseif ($photo->subject_name == 'CommercialProperty') {
                // Selecting Commercial model to work with
                $current_model = $models['commercial'];
            // If record's subject type is Residential
            } elseif($photo->subject_name == 'ResidentialProperty') {
                // Selecting Residential model to work with
                $current_model = $models['residential'];
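            } else {
                // Unknown subject type: skip so a stale or undefined $current_model is never used
                continue;
            }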
            // If linked listing doesn't exist anymore
            if (!$current_model->where('ml_num', '=', $photo->owner_id)->first()) {
                // Deleting a storage record and physical file
                Storage::delete('/files/property/photos/'.$photo->file_name);
                $models['storage']->unregisterFile($photo->id);
            }
        }
    });
}

Answer 1

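Using chunk() in Eloquent adds a limit and an offset to the SQL query and executes it once per chunk. If you change data in the database in a way that reduces the number of rows matching the query, then because of the offset you will skip matching rows in the next execution.

For example, if you have 9 rows with id = 1...9 and subject_type = 'property_photo' and you use chunk(3, ...), the generated queries are: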

select * from store where subject_type = 'property_photo' limit 3 offset 0;
select * from store where subject_type = 'property_photo' limit 3 offset 3;
select * from store where subject_type = 'property_photo' limit 3 offset 6;
select * from store where subject_type = 'property_photo' limit 3 offset 9;

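If inside each chunk you set subject_type = 'something' on every row, those rows no longer match, and the next query with offset 3 will effectively skip the next 3 matches. Deleting matching rows, as the cleanup in the question does, shrinks the result set in the same way.

It is possible to use Collection::each() with a closure, as follows, although this still requires loading the whole result set into a collection: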
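To make the skipping concrete, here is a minimal sketch in plain PHP that imitates limit/offset paging over an in-memory array instead of a real table. The function name simulateOffsetChunking and the array layout are made up for illustration; this is not Eloquent's actual implementation, only the same paging arithmetic applied to the 9-row example.

```php
<?php
// Hypothetical simulation of limit/offset chunking over an in-memory "table".
// Returns the ids that the per-chunk callback actually got to see.
function simulateOffsetChunking(array $rows, int $size): array
{
    $seen = [];
    $page = 0;
    while (true) {
        // Re-run the "query": only rows that still match are candidates.
        $matching = array_values(array_filter($rows, function ($row) {
            return $row['subject_type'] === 'property_photo';
        }));
        // Equivalent of "limit $size offset $page * $size".
        $chunk = array_slice($matching, $page * $size, $size);
        if (count($chunk) === 0) {
            break;
        }
        foreach ($chunk as $row) {
            $seen[] = $row['id'];
            // Changing the row removes it from the next query's result set.
            $rows[$row['id']]['subject_type'] = 'something';
        }
        $page++;
    }
    return $seen;
}

$rows = [];
for ($id = 1; $id <= 9; $id++) {
    $rows[$id] = ['id' => $id, 'subject_type' => 'property_photo'];
}
$seen = simulateOffsetChunking($rows, 3);
// $seen is [1, 2, 3, 7, 8, 9]: rows 4, 5 and 6 were never visited.
```

The loop ends "normally" once an offset lands past the shrunken result set, which matches the symptom of the job stopping without any error.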

$models['storage']->where('subject_type', '=', 'property_photo')->get()->each(function ($photo) use (&$models) {
  if ($photo->subject_name == 'CondoProperty') {
    //...
  }
  //...
});

Keep in mind that you can also run DB::disableQueryLog(); to save memory during large database operations.
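A different technique from the one suggested above, which avoids both the skipping and loading everything at once, is to page by primary key ("id greater than the last id seen") rather than by offset. Newer Laravel releases provide chunkById() for this; whether it is available depends on your version. The sketch below is plain PHP over an in-memory array, with the hypothetical name simulateKeysetChunking, and only illustrates the idea.

```php
<?php
// Hypothetical simulation: page by "id > last seen id" instead of by offset,
// so rows dropping out of the match set cannot shift the paging window.
function simulateKeysetChunking(array $rows, int $size): array
{
    $seen = [];
    $lastId = 0;
    while (true) {
        // Equivalent of "where subject_type = 'property_photo' and id > $lastId".
        $matching = array_values(array_filter($rows, function ($row) use ($lastId) {
            return $row['subject_type'] === 'property_photo' && $row['id'] > $lastId;
        }));
        // Equivalent of "order by id limit $size".
        usort($matching, function ($a, $b) {
            return $a['id'] <=> $b['id'];
        });
        $chunk = array_slice($matching, 0, $size);
        if (count($chunk) === 0) {
            break;
        }
        foreach ($chunk as $row) {
            $seen[] = $row['id'];
            // Same mutation as before: the row stops matching the query.
            $rows[$row['id']]['subject_type'] = 'something';
            $lastId = $row['id'];
        }
    }
    return $seen;
}

$rows = [];
for ($id = 1; $id <= 9; $id++) {
    $rows[$id] = ['id' => $id, 'subject_type' => 'property_photo'];
}
$seenAll = simulateKeysetChunking($rows, 3);
// $seenAll is [1, 2, 3, 4, 5, 6, 7, 8, 9]: every row is visited exactly once.
```

Because each page starts after the last id processed, it does not matter how many earlier rows were changed or deleted in the meantime.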

Answer 2

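You should add try ... catch around the suspicious code and write the exception message to a log file. I once ran into the same problem and eventually found that it was also related to memory consumption.

To me the most suspicious part is the reuse of the model in $current_model->where(). I suspect memory may not be released after each query. Basically, each query object should be used only once. Is there a reason to reuse it?

Try changing it to $current_model = App::make('YourModel'); instead of reusing the instances from $models, and see whether that solves the problem.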
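As a rough sketch of that suggestion, the per-record work can be wrapped so that any exception is logged and the loop carries on. The helper name processWithLogging and its callable parameters are hypothetical; in the question's code the handler would be the body of the foreach loop and the logger would write to your log file.

```php
<?php
// Hypothetical wrapper: runs $handler on every item, logs any exception
// through $log, and keeps going. Returns the number of failed items.
function processWithLogging(array $items, callable $handler, callable $log): int
{
    $failures = 0;
    foreach ($items as $item) {
        try {
            $handler($item);
        } catch (\Throwable $e) {
            // Record the exception class and message instead of dying silently.
            $log(get_class($e) . ': ' . $e->getMessage());
            $failures++;
        }
    }
    return $failures;
}

$messages = [];
$failures = processWithLogging(
    [1, 2, 3],
    function ($n) {
        if ($n === 2) {
            throw new RuntimeException("item $n failed");
        }
    },
    function ($msg) use (&$messages) {
        $messages[] = $msg;
    }
);
// $failures is 1 and $messages is ['RuntimeException: item 2 failed'].
```

Note that exceeding PHP's memory limit raises a fatal error rather than a Throwable, so a try ... catch like this would not intercept that particular failure.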

Random code execution stop when using Eloquent chunk()
