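I have a very large CSV file (2.4 GB) hosted remotely on S3, and I am trying to ingest it into my Rails application.
I load it into a temp file, and that seems to work fine, but about ten minutes after I start ingesting/iterating over the file, the connection is consistently terminated on me with a SIGTERM.
I am running Rails 4.2 with mysql2 0.3.20.
What am I missing? How can I get this to work?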
rake aborted!
SignalException: SIGTERM
/app/vendor/bundle/ruby/2.2.0/gems/mysql2-0.3.21/lib/mysql2/client.rb:80:in `_query'
/app/vendor/bundle/ruby/2.2.0/gems/mysql2-0.3.21/lib/mysql2/client.rb:80:in `block in query'
/app/vendor/bundle/ruby/2.2.0/gems/mysql2-0.3.21/lib/mysql2/client.rb:79:in `handle_interrupt'
/app/vendor/bundle/ruby/2.2.0/gems/mysql2-0.3.21/lib/mysql2/client.rb:79:in `query'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/connection_adapters/abstract_mysql_adapter.rb:299:in `block in execute'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/connection_adapters/abstract_adapter.rb:466:in `block in log'
/app/vendor/bundle/ruby/2.2.0/gems/activesupport-4.2.0/lib/active_support/notifications/instrumenter.rb:20:in `instrument'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/connection_adapters/abstract_adapter.rb:460:in `log'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/connection_adapters/abstract_mysql_adapter.rb:299:in `execute'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/connection_adapters/mysql2_adapter.rb:231:in `execute'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/connection_adapters/mysql2_adapter.rb:235:in `exec_query'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/connection_adapters/abstract/database_statements.rb:336:in `select'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/connection_adapters/abstract/database_statements.rb:32:in `select_all'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/connection_adapters/abstract/query_cache.rb:70:in `select_all'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/connection_adapters/abstract/database_statements.rb:38:in `select_one'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/connection_adapters/abstract/database_statements.rb:43:in `select_value'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/relation/finder_methods.rb:314:in `exists?'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/querying.rb:3:in `exists?'
Answer 0 (score: 0)
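You can do this in two ways. The first is to use the SmarterCSV gem; test it locally first to make sure it can handle a file of this size. If size is not a problem, this is your best option, because it makes it very easy to process a large CSV before importing it. If that does not work, you can do the following:
Use MySQL's import facility (discussed here: http://dev.mysql.com/doc/refman/5.7/en/mysqlimport.html) to first load the data directly into a table in MySQL. You can then use the Rails find_each method to iterate over the records in batches and move the data into the appropriate tables, which avoids overloading the garbage collector. I am not sure whether the import facility works the same way as Postgres' COPY, but if it does, and your CSV file has no primary key column, make sure the table you create in Rails to hold the initial data load has no primary key.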
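
To illustrate the batching idea behind both suggestions, here is a minimal sketch that uses only Ruby's standard csv library rather than SmarterCSV itself. The helper name each_csv_batch, the batch size, and the file path in the usage note are hypothetical, and the sketch assumes the file has a header row. It streams the file row by row and hands the caller small arrays of hashes, so the 2.4 GB file is never held in memory at once.

```ruby
require 'csv'

# Hypothetical helper (not part of Rails or SmarterCSV): streams a CSV file
# and yields arrays of at most batch_size row hashes.
# Assumes the first line of the file is a header row.
def each_csv_batch(path, batch_size: 1_000)
  # Without a block, CSV.foreach returns an enumerator that reads lazily,
  # so each_slice only ever buffers one batch of rows.
  CSV.foreach(path, headers: true).each_slice(batch_size) do |rows|
    yield rows.map(&:to_h)
  end
end
```

A caller might use it as each_csv_batch('/tmp/import.csv', batch_size: 5_000) { |batch| ... } and write each batch to the database in one statement. The stack trace above ends in exists?, which suggests one query per row; if so, working per batch instead could cut the number of round trips considerably, though that is an inference from the trace and not something the question confirms.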