I am trying to import a huge CSV file into my database from my Laravel web app: 500,000 rows, 93 columns, about 700 MB.
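As far as I know, the best way to handle this amount of data is to use MyISAM with a LOAD DATA LOCAL INFILE query, but when I try to load the file my script hangs for hours. When I cancel it, the MySQL server appears to have crashed: no migrations or other inserts can be run until I manually restart the mysql service.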
Here is my code:
DB::connection()->getpdo()->exec("set unique_checks = 0;");
DB::connection()->getpdo()->exec("set foreign_key_checks = 0;");
DB::connection()->getpdo()->exec("alter table task_metas disable keys;");
$query = "LOAD DATA LOCAL INFILE '$path'
INTO TABLE task_metas
CHARACTER SET utf8mb4
FIELDS TERMINATED BY '$delimiter'
OPTIONALLY ENCLOSED BY '$enclosed'
LINES TERMINATED BY '$lineending'
IGNORE 1 LINES
(@col1, @col2, @col3, @col4, @col5, @col6, @col7, @col8, @col9,
@col10, @col11, @col12, @col13, @col14, @col15, @col16,
@col17, @col18, @col19, @col20, @col21, @col22, @col23,
@col24, @col25, @col26, @col27, @col28, @col29, @col30,
@col31, @col32, @col33, @col34, @col35, @col36, @col37,
@col38, @col39, @col40, @col41, @col42, @col43, @col44,
@col45, @col46, @col47, @col48, @col49, @col50, @col51,
@col52, @col53, @col54, @col55, @col56, @col57, @col58,
@col59, @col60, @col61, @col62, @col63, @col64, @col65,
@col66, @col67, @col68, @col69, @col70, @col71, @col72,
@col73, @col74, @col75, @col76, @col77, @col78, @col79,
@col80, @col81, @col82, @col83, @col84, @col85, @col86,
@col87, @col88, @col89, @col90, @col91, @col92, @col93
)
SET task_id=null,
project_id=$project->id, full_comment=null, relevance=-1,
note=null, url=@col1, indexed=@col2, published=@col3,
search_indexed=@col4, title_snippet=@col5, content_snippet=@col6,
title=@col7, content=@col8, root_url=@col9, domain_url=@col10,
host_url=@col11, parent_url=@col12, lang=@col13, porn_level=@col14,
fluency_level=@col15, spam_level=@col16, sentiment=@col17,
source_type=@col18, post_type=@col19, cluster_id=@col20,
meta_cluster_id=@col21, tags_internal=@col22, tags_marking=@col23,
tags_customer=@col24, entity_urls=@col25, images_url=@col26,
images_width=@col27, images_height=@col28, images_legend=@col29,
videos_url=@col30, videos_width=@col31, videos_height=@col32,
videos_legend=@col33, pagemonitoring_sitemon_siteid=@col34,
matched_profile=@col35, article_extended_attributes_facebook_shares=@col36,
article_extended_attributes_facebook_likes=@col37, article_extended_attributes_twitter_retweets=@col38,
article_extended_attributes_url_views=@col39, article_extended_attributes_pinterest_likes=@col40,
article_extended_attributes_pinterest_pins=@col41, article_extended_attributes_pinterest_repins=@col42,
article_extended_attributes_youtube_views=@col43, article_extended_attributes_youtube_likes=@col44, article_extended_attributes_youtube_dislikes=@col45, article_extended_attributes_instagram_likes=@col46, article_extended_attributes_twitter_shares=@col47, article_extended_attributes_num_comments=@col48, source_extended_attributes_alexa_pageviews=@col49, source_extended_attributes_facebook_followers=@col50, source_extended_attributes_twitter_followers=@col51, source_extended_attributes_instagram_followers=@col52, source_extended_attributes_pinterest_followers=@col53, extra_article_attributes_world_data_continent=@col54, extra_article_attributes_world_data_country=@col55, extra_article_attributes_world_data_country_code=@col56, extra_article_attributes_world_data_region=@col57, extra_article_attributes_world_data_city=@col58, extra_article_attributes_world_data_longitude=@col59, extra_article_attributes_world_data_latitude=@col60, extra_author_attributes_id=@col61, extra_author_attributes_type=@col62, extra_author_attributes_name=@col63, extra_author_attributes_birthdate_date=@col64, extra_author_attributes_birthdate_resolution=@col65, extra_author_attributes_gender=@col66, extra_author_attributes_image_url=@col67, extra_author_attributes_short_name=@col68, extra_author_attributes_url=@col69, extra_author_attributes_world_data_continent=@col70, extra_author_attributes_world_data_country=@col71, extra_author_attributes_world_data_country_code=@col72, extra_author_attributes_world_data_region=@col73, extra_author_attributes_world_data_city=@col74, extra_author_attributes_world_data_longitude=@col75,
extra_author_attributes_world_data_latitude=@col76, extra_source_attributes_world_data_continent=@col77, extra_source_attributes_world_data_country=@col78, extra_source_attributes_world_data_country_code=@col79, extra_source_attributes_world_data_region=@col80, extra_source_attributes_world_data_city=@col81, extra_source_attributes_world_data_longitude=@col82, extra_source_attributes_world_data_latitude=@col83, engagement=@col84, reach=@col85, provider=@col86, generator_type=@col87, source_extended_attributes_alexa_unique_visitors=@col88, article_extended_attributes_twitter_likes=@col89, extra_author_attributes_description=@col90, article_extended_attributes_linkedin_shares=@col91,
extra_source_attributes_name=@col92, word_count=@col93,
created_at=NOW(), updated_at=NOW()";
$results = DB::connection()->getpdo()->exec($query);
DB::connection()->getpdo()->exec("set unique_checks = 1;");
DB::connection()->getpdo()->exec("set foreign_key_checks = 1;");
DB::connection()->getpdo()->exec("alter table task_metas enable keys;");
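My source file is not easy to work with either: its column names do not match my table's column names, and my table has some additional columns that I set manually in the query.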
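The positional mapping described here (CSV column N goes to user variable @colN, which is then assigned to a table column, plus a few fixed columns) does not have to be typed out by hand. The following is a minimal sketch only: buildLoadDataMapping is a hypothetical helper, not part of Laravel or PDO, and it assumes the target column names are listed in the same order as the CSV columns and that the fixed values are trusted raw SQL expressions.

```php
<?php
declare(strict_types=1);

/**
 * Build the "(@col1, ...)" variable list and the "SET ..." clause for
 * LOAD DATA from target column names given in CSV column order.
 * Hypothetical helper; $fixed maps extra columns to raw SQL expressions.
 */
function buildLoadDataMapping(array $csvOrderColumns, array $fixed = []): array
{
    $variables = [];
    $assignments = [];

    // Columns that are not in the CSV get their fixed expression first.
    foreach ($fixed as $column => $expression) {
        $assignments[] = $column . '=' . $expression;
    }

    // CSV column N is read into @colN and assigned to its table column.
    foreach (array_values($csvOrderColumns) as $index => $column) {
        $variable = '@col' . ($index + 1);
        $variables[] = $variable;
        $assignments[] = $column . '=' . $variable;
    }

    return [
        'variables' => '(' . implode(', ', $variables) . ')',
        'set' => 'SET ' . implode(', ', $assignments),
    ];
}

// Example with three of the 93 columns and two fixed columns.
$mapping = buildLoadDataMapping(
    ['url', 'indexed', 'published'],
    ['relevance' => '-1', 'created_at' => 'NOW()']
);
```

The two resulting strings can be spliced into the query in place of the hand-written variable list and SET clause.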
I also tried InnoDB, removing the alter table queries and wrapping the load data query as follows:
DB::connection()->getpdo()->exec("set autocommit = 0;");
// query as above
DB::connection()->getpdo()->exec("commit;");
I also set the innodb-autoinc-lock-mode variable in the mysql.conf file to 2, but I get the same behaviour as with MyISAM.
While the query is running I can't see anything happening in the database, and no new rows are inserted. Is that normal?
I also tried a smaller file with only 170,000 rows, with the same result.
My server is a DigitalOcean Droplet with 2 CPUs and 2 GB of RAM, running Ubuntu 16.05; the application is deployed with Laravel Forge. The web server is Nginx 11.x.
Answer 0 (score: 0)
It turned out the problem was not the server: I had simply set the wrong delimiter and line-ending characters!
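I created my test data in Excel 2016 for Mac, and it turns out that by default it sets the delimiter to a semicolon and uses a newline character different from the one I had specified. I just changed these variables, and now everything imports in 20 seconds.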
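A mismatch like this can be caught before running LOAD DATA by inspecting the first bytes of the file. The following is a minimal sketch, not part of Laravel or MySQL: detectCsvFormat is a hypothetical helper, and the list of candidate delimiters and the 64 KiB sample size are assumptions. It is only a heuristic and can be fooled by quoted fields that contain delimiters or line breaks.

```php
<?php
declare(strict_types=1);

/**
 * Guess the line ending and field delimiter from a sample of a CSV file.
 * Hypothetical helper for illustration only.
 */
function detectCsvFormat(string $sample): array
{
    // Check "\r\n" first so Windows files are not mistaken for "\n" or "\r".
    if (strpos($sample, "\r\n") !== false) {
        $lineEnding = "\r\n";
    } elseif (strpos($sample, "\n") !== false) {
        $lineEnding = "\n";
    } elseif (strpos($sample, "\r") !== false) {
        $lineEnding = "\r";
    } else {
        $lineEnding = "\n"; // no line break in the sample: fall back to "\n"
    }

    // Look only at the header row and keep the most frequent candidate.
    $header = explode($lineEnding, $sample, 2)[0];
    $delimiter = ',';
    $best = 0;
    foreach ([',', ';', "\t", '|'] as $candidate) {
        $count = substr_count($header, $candidate);
        if ($count > $best) {
            $best = $count;
            $delimiter = $candidate;
        }
    }

    return ['delimiter' => $delimiter, 'lineEnding' => $lineEnding];
}

// Usage: read only the first 64 KiB rather than the whole 700 MB file.
// $sample = file_get_contents($path, false, null, 0, 65536);
// $format = detectCsvFormat($sample);
```

The detected values could then be passed as $delimiter and $lineending when building the query, or simply logged to confirm they match what the query expects.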