LOAD DATA LOCAL INFILE is slow and crashes MySQL

Asked: 2017-07-25 02:29:30

Tags: php mysql laravel innodb myisam

I'm trying to import a huge CSV file into my database from my Laravel web app: 500,000 rows, 93 columns, about 700 MB.

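As far as I know, the best way to handle this volume of data is MyISAM with a LOAD DATA LOCAL INFILE query, but when I try to load the file my script hangs for hours. When I cancel it, the MySQL server seems to have crashed: I cannot run migrations or any other inserts until I manually restart the mysql service.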

Here is my code:

DB::connection()->getpdo()->exec("set unique_checks = 0;");
DB::connection()->getpdo()->exec("set foreign_key_checks = 0;");
DB::connection()->getpdo()->exec("alter table task_metas disable keys;");

$query = "LOAD DATA LOCAL INFILE '$path'
    INTO  TABLE task_metas
    CHARACTER SET utf8mb4
    FIELDS TERMINATED BY '$delimiter'
    OPTIONALLY ENCLOSED BY '$enclosed'
    LINES TERMINATED BY '$lineending'
    IGNORE 1 LINES
    (@col1,     @col2, @col3, @col4, @col5, @col6, @col7, @col8, @col9,
                @col10, @col11, @col12, @col13, @col14, @col15, @col16,
                @col17, @col18, @col19, @col20, @col21, @col22, @col23,
                @col24, @col25, @col26, @col27, @col28, @col29, @col30,
                @col31, @col32, @col33, @col34, @col35, @col36, @col37,
                @col38, @col39, @col40, @col41, @col42, @col43, @col44,
                @col45, @col46, @col47, @col48, @col49, @col50, @col51,
                @col52, @col53, @col54, @col55, @col56, @col57, @col58,
                @col59, @col60, @col61, @col62, @col63, @col64, @col65,
                @col66, @col67, @col68, @col69, @col70, @col71, @col72,
                @col73, @col74, @col75, @col76, @col77, @col78, @col79,
                @col80, @col81, @col82, @col83, @col84, @col85, @col86,
                @col87, @col88, @col89, @col90, @col91, @col92, @col93
    )
SET task_id=null,
        project_id=$project->id, full_comment=null, relevance=-1,
        note=null, url=@col1, indexed=@col2, published=@col3,
        search_indexed=@col4, title_snippet=@col5, content_snippet=@col6,
        title=@col7, content=@col8, root_url=@col9, domain_url=@col10,
        host_url=@col11, parent_url=@col12, lang=@col13, porn_level=@col14,
        fluency_level=@col15, spam_level=@col16, sentiment=@col17,
        source_type=@col18, post_type=@col19, cluster_id=@col20,
        meta_cluster_id=@col21, tags_internal=@col22, tags_marking=@col23,
        tags_customer=@col24, entity_urls=@col25, images_url=@col26,
        images_width=@col27, images_height=@col28, images_legend=@col29,
        videos_url=@col30, videos_width=@col31, videos_height=@col32,
        videos_legend=@col33, pagemonitoring_sitemon_siteid=@col34,
        matched_profile=@col35, article_extended_attributes_facebook_shares=@col36,
        article_extended_attributes_facebook_likes=@col37, article_extended_attributes_twitter_retweets=@col38,
        article_extended_attributes_url_views=@col39, article_extended_attributes_pinterest_likes=@col40,
        article_extended_attributes_pinterest_pins=@col41, article_extended_attributes_pinterest_repins=@col42,
        article_extended_attributes_youtube_views=@col43, article_extended_attributes_youtube_likes=@col44, article_extended_attributes_youtube_dislikes=@col45, article_extended_attributes_instagram_likes=@col46, article_extended_attributes_twitter_shares=@col47, article_extended_attributes_num_comments=@col48, source_extended_attributes_alexa_pageviews=@col49, source_extended_attributes_facebook_followers=@col50, source_extended_attributes_twitter_followers=@col51, source_extended_attributes_instagram_followers=@col52, source_extended_attributes_pinterest_followers=@col53, extra_article_attributes_world_data_continent=@col54, extra_article_attributes_world_data_country=@col55, extra_article_attributes_world_data_country_code=@col56, extra_article_attributes_world_data_region=@col57, extra_article_attributes_world_data_city=@col58, extra_article_attributes_world_data_longitude=@col59, extra_article_attributes_world_data_latitude=@col60, extra_author_attributes_id=@col61, extra_author_attributes_type=@col62, extra_author_attributes_name=@col63, extra_author_attributes_birthdate_date=@col64, extra_author_attributes_birthdate_resolution=@col65, extra_author_attributes_gender=@col66, extra_author_attributes_image_url=@col67, extra_author_attributes_short_name=@col68, extra_author_attributes_url=@col69, extra_author_attributes_world_data_continent=@col70, extra_author_attributes_world_data_country=@col71, extra_author_attributes_world_data_country_code=@col72, extra_author_attributes_world_data_region=@col73, extra_author_attributes_world_data_city=@col74, extra_author_attributes_world_data_longitude=@col75,
        extra_author_attributes_world_data_latitude=@col76, extra_source_attributes_world_data_continent=@col77, extra_source_attributes_world_data_country=@col78, extra_source_attributes_world_data_country_code=@col79, extra_source_attributes_world_data_region=@col80, extra_source_attributes_world_data_city=@col81, extra_source_attributes_world_data_longitude=@col82, extra_source_attributes_world_data_latitude=@col83, engagement=@col84, reach=@col85, provider=@col86, generator_type=@col87, source_extended_attributes_alexa_unique_visitors=@col88, article_extended_attributes_twitter_likes=@col89, extra_author_attributes_description=@col90, article_extended_attributes_linkedin_shares=@col91,
        extra_source_attributes_name=@col92, word_count=@col93,
        created_at=NOW(), updated_at=NOW()";

$results = DB::connection()->getpdo()->exec($query);

DB::connection()->getpdo()->exec("set unique_checks = 1;");
DB::connection()->getpdo()->exec("set foreign_key_checks = 1;");
DB::connection()->getpdo()->exec("alter table task_metas enable keys;");

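My source file is not easy to work with either: its column names don't match my table's column names, and my table has some extra columns that I set manually in the query.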

I also tried InnoDB, removing the alter table queries and wrapping the LOAD DATA query like this:

DB::connection()->getpdo()->exec("set autocommit = 0;");
// query as above
DB::connection()->getpdo()->exec("commit;");

I also set the innodb-autoinc-lock-mode variable to 2 in the mysql.conf file, but I get the same behavior as with MyISAM.

While the query is running I don't see anything happening in the database, and no new rows are inserted. Is that normal?

I also tried a smaller file with 170,000 rows, with the same result.

My server is a DigitalOcean Droplet with 2 CPUs and 2 GB of RAM, running Ubuntu 16.04, with the app deployed through Laravel Forge. The web server is Nginx 1.11.x.

1 Answer:

Answer 0: (score: 0)

It turned out the problem was not the server at all: I had simply set the wrong delimiter and line-ending characters!

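I created my test data in Excel 2016 for Mac, and it turns out that by default it uses a semicolon as the delimiter and a line ending that did not match the one in my query. I just changed those variables, and now everything imports in 20 seconds.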
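
For anyone who hits the same thing, here is a rough sketch of how the delimiter and line ending could be checked before running the import. The function name sniffCsvFormat and the list of candidate delimiters are my own assumptions for illustration, not something from Laravel or PDO, and the heuristic can be fooled by quoted fields that contain delimiters or line breaks. It needs PHP 7.3+ because of array_key_first.

```php
<?php
/**
 * Guess the field delimiter and line ending of a CSV file from its first bytes.
 * Hypothetical helper for illustration only; a heuristic, not a CSV parser.
 */
function sniffCsvFormat(string $path, int $sampleBytes = 65536): array
{
    // Read only the start of the file so a 700 MB file is not loaded into memory.
    $sample = file_get_contents($path, false, null, 0, $sampleBytes);
    if ($sample === false || $sample === '') {
        throw new RuntimeException("Cannot read a sample from $path");
    }

    // Count each line-ending style; "\r\n" pairs are subtracted from the
    // lone "\r" and lone "\n" counts so they are not counted twice.
    $crlf = substr_count($sample, "\r\n");
    $endings = [
        "\r\n" => $crlf,
        "\r"   => substr_count($sample, "\r") - $crlf,
        "\n"   => substr_count($sample, "\n") - $crlf,
    ];
    arsort($endings); // most frequent first
    $lineEnding = array_key_first($endings);

    // Count candidate delimiters in the header line only.
    $header = explode($lineEnding, $sample, 2)[0];
    $counts = [];
    foreach ([',', ';', "\t", '|'] as $candidate) {
        $counts[$candidate] = substr_count($header, $candidate);
    }
    arsort($counts); // most frequent first
    $delimiter = array_key_first($counts);

    return ['delimiter' => $delimiter, 'line_ending' => $lineEnding];
}
```

The returned 'delimiter' and 'line_ending' values could then be compared against (or used in place of) the hand-set $delimiter and $lineending from the question before the LOAD DATA query is built.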