I'm using Faker to generate dummy data and am trying to insert 1 million records. For some reason I can only get to about 100,000 rows. Here is my code:
$no_of_rows = 1000000;
for ($i = 1; $i <= $no_of_rows; $i++) {
    $user_data[] = [
        'status' => 'ACTIVE',
        'username' => $faker->userName,
        'email' => $faker->email,
        'password' => $password,
        'firstname' => $faker->firstName,
        'surname' => $faker->lastName,
        'mobilenumber' => $faker->phoneNumber,
        'confirmed' => (int)$faker->boolean(50),
        'gender' => $faker->boolean(50) ? 'MALE' : 'FEMALE',
        'dob' => $faker->date(),
        'address_line_1' => $faker->address,
        'address_line_2' => '',
        'post_code' => $faker->postcode,
    ];
}
User::insert($user_data);
I get the following error message:
PHP Fatal error: Allowed memory size of 1073741824 bytes exhausted
I have already set ini_set('memory_limit', '1024M');
Any helpful ideas or solutions?
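Answer 0 (score: 4)
The core issue here is that the Faker library instance (commonly used to generate data in Laravel) is memory-heavy, and the garbage collector does not clean it up properly when it is used in a large loop.
I agree with the chunking approach that @Rob Mkrtchyan added above, but since this is Laravel, I'd suggest a more elegant solution using the model factory facility.
You can create a dedicated model factory (in Laravel 5.3 this should go in database/factories/), for example: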
$factory->define(Tests::class, function (Faker\Generator $faker) {
    return [
        'status' => 'ACTIVE',
        'username' => $faker->userName,
        'email' => $faker->email,
        'password' => bcrypt('secret'),
        'firstname' => $faker->firstName,
        'surname' => $faker->lastName,
        'mobilenumber' => $faker->phoneNumber,
        'confirmed' => (int)$faker->boolean(50),
        'gender' => $faker->boolean(50) ? 'MALE' : 'FEMALE',
        'dob' => $faker->date(),
        'address_line_1' => $faker->address,
        'address_line_2' => '',
        'post_code' => $faker->postcode,
    ];
});
Running the factory from your DB seeder class is then simple. Note that the number 200 is the number of seed data entries to create.
factory(Tests::class, 200)
->create();
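The reason to use a seed factory is that it gives you more flexibility for setting variables and so on. For documentation, see the Laravel docs on database seeding.
Now, since you are dealing with a large number of records, it is trivial to implement a chunked solution that helps PHP's garbage collection. For example: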
for ($i = 0; $i < 5000; $i++) {
    factory(Tests::class, 200)
        ->create();
}
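I ran a quick test: with this configuration (5000 batches of 200, i.e. 1,000,000 rows), your script's memory usage should stay at around 12-15 MB no matter how many entries are created (depending, of course, on other system factors).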
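To illustrate why batching keeps memory flat, here is a minimal sketch in plain PHP, without Laravel or Faker. `makeRow()` and `memoryHeldBy()` are hypothetical helpers introduced only for this illustration, and the rows are much smaller than real Faker output, so the absolute numbers will not match the figures quoted above.

```php
<?php
// Hypothetical stand-in for one Faker-generated user row.
function makeRow(int $i): array
{
    return [
        'status'   => 'ACTIVE',
        'username' => 'user' . $i,
        'email'    => 'user' . $i . '@example.com',
    ];
}

// Bytes still allocated after building $rows rows into a single array.
function memoryHeldBy(int $rows): int
{
    $before = memory_get_usage();
    $data = [];
    for ($i = 0; $i < $rows; $i++) {
        $data[] = makeRow($i);
    }
    // $data is still alive here, so the difference includes the whole array.
    return memory_get_usage() - $before;
}

// Memory grows with the number of rows kept alive at once, so a small
// batch needs far less than one array holding every row.
printf("1,000 rows:  %d bytes\n", memoryHeldBy(1000));
printf("50,000 rows: %d bytes\n", memoryHeldBy(50000));
```

The exact byte counts depend on the PHP version and platform. The point is only that the cost is driven by the largest array alive at any moment, not by the total number of rows inserted.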
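Answer 1 (score: 2)
The variable set in the foreach loop is never used, so if the sole purpose of the foreach loop is to add a million records, you could do away with the foreach and use something like this. That way the array used to populate the database is re-declared on every iteration rather than accumulating more and more entries.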
$no_of_rows = 1000000;
for ($i = 0; $i < $no_of_rows; $i++) {
    $user_data = array(
        'status' => 'ACTIVE',
        'username' => $faker->userName,
        'email' => $faker->email,
        'password' => $password,
        'firstname' => $faker->firstName,
        'surname' => $faker->lastName,
        'mobilenumber' => $faker->phoneNumber,
        'confirmed' => (int)$faker->boolean(50),
        'gender' => $faker->boolean(50) ? 'MALE' : 'FEMALE',
        'dob' => $faker->date(),
        'address_line_1' => $faker->address,
        'address_line_2' => '',
        'post_code' => $faker->postcode,
    );
    User::insert($user_data);
    $user_data = null;
}
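Based on your last comment I can see why chunks were used. I didn't know the syntax of the SQL before posting my reply, so perhaps this is more appropriate?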
$no_of_rows = 1000000;
$range = range(1, $no_of_rows);
$chunksize = 1000;
foreach (array_chunk($range, $chunksize) as $chunk) {
    $user_data = array(); /* array is re-initialised each major iteration */
    foreach ($chunk as $i) {
        $user_data[] = array(
            'status' => 'ACTIVE',
            'username' => $faker->userName,
            'email' => $faker->email,
            'password' => $password,
            'firstname' => $faker->firstName,
            'surname' => $faker->lastName,
            'mobilenumber' => $faker->phoneNumber,
            'confirmed' => (int)$faker->boolean(50),
            'gender' => $faker->boolean(50) ? 'MALE' : 'FEMALE',
            'dob' => $faker->date(),
            'address_line_1' => $faker->address,
            'address_line_2' => '',
            'post_code' => $faker->postcode
        );
    }
    User::insert($user_data);
}
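One caveat with the version above: range(1, $no_of_rows) and array_chunk() both build million-element arrays of integers before any row is generated. A possible variation, sketched here with a PHP generator, produces the chunks lazily instead. `rowChunks()` and its `$makeRow` callback are hypothetical names, not part of Laravel or Faker.

```php
<?php
/**
 * Lazily yields arrays of at most $chunkSize rows.
 * $makeRow is a caller-supplied callback (hypothetical here) that builds
 * one row from its 1-based index, e.g. using Faker.
 */
function rowChunks(int $total, int $chunkSize, callable $makeRow): Generator
{
    $chunk = [];
    for ($i = 1; $i <= $total; $i++) {
        $chunk[] = $makeRow($i);
        if (count($chunk) === $chunkSize) {
            yield $chunk;   // hand the full chunk to the caller
            $chunk = [];    // start a fresh chunk so the old one can be freed
        }
    }
    if ($chunk !== []) {
        yield $chunk;       // final, possibly shorter, chunk
    }
}
```

In a seeder this could be consumed as foreach (rowChunks(1000000, 1000, $makeRow) as $user_data) { User::insert($user_data); }, so only one chunk of rows exists at a time.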
Answer 2 (score: 1)
Hi: this is a very good and very fast solution for inserting data.
$no_of_data = 1000000;
$test_data = array();
for ($i = 0; $i < $no_of_data; $i++) {
    $test_data[$i]['number'] = "1234567890";
    $test_data[$i]['message'] = "Test Data";
    $test_data[$i]['status'] = "Delivered";
}
$chunk_data = array_chunk($test_data, 1000);
if (isset($chunk_data) && !empty($chunk_data)) {
    foreach ($chunk_data as $chunk_data_val) {
        DB::table('messages')->insert($chunk_data_val);
    }
}
Answer 3 (score: 0)
Hi: here is a good solution.
public function run()
{
    // Assumed setup: the snippet uses $faker and $password without defining them.
    $faker = \Faker\Factory::create();
    $password = bcrypt('secret');

    // 1000 batches of 1000 rows = 1,000,000 rows
    for ($j = 0; $j < 1000; $j++) {
        // Reset the batch so it never holds more than 1000 rows
        $user_data = [];
        for ($i = 0; $i < 1000; $i++) {
            $user_data[] = [
                'status' => 'ACTIVE',
                'username' => $faker->userName,
                'email' => $faker->email,
                'password' => $password,
                'firstname' => $faker->firstName,
                'surname' => $faker->lastName,
                'mobilenumber' => $faker->phoneNumber,
                'confirmed' => (int)$faker->boolean(50),
                'gender' => $faker->boolean(50) ? 'MALE' : 'FEMALE',
                'dob' => $faker->date(),
                'address_line_1' => $faker->address,
                'address_line_2' => '',
                'post_code' => $faker->postcode,
            ];
        }
        User::insert($user_data);
    }
}
This code only ever keeps a 1000-element array in memory, so you can run it without changing any default PHP settings.
Enjoy.