AWS Aurora out-of-memory error during a large import

Time: 2017-10-25 11:17:27

Tags: mysql amazon-web-services amazon-rds amazon-rds-aurora

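I am trying to import a database from a 4.5 GB mysqldump file into an AWS Aurora DB instance. The database has about 80 tables, and the largest table has about 13 million rows (the rest are much smaller). My dump file uses multi-value inserts, each capped at 64MB, as determined by the max_allowed_packet I set for mysqldump in my.cnf.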
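To confirm how large the multi-value inserts in a dump really are, something like the following could be run against the file. It is only a minimal sketch: it assumes mysqldump wrote each INSERT statement on a single line (its usual behaviour with extended inserts), and the function name `largest_statements` and the file name in the usage comment are made up for illustration.

```python
import heapq


def largest_statements(dump_path, top=3):
    """Return the `top` longest INSERT lines as (size_in_bytes, line_number) pairs.

    Assumes one INSERT statement per line, which is how mysqldump normally
    writes extended inserts. The size includes the trailing newline.
    """
    def insert_sizes():
        # Read in binary mode so the reported sizes are bytes, not characters.
        with open(dump_path, "rb") as dump:
            for line_number, line in enumerate(dump, start=1):
                if line.startswith(b"INSERT INTO"):
                    yield len(line), line_number

    # nlargest keeps only `top` entries in memory while scanning the file.
    return heapq.nlargest(top, insert_sizes())


# Hypothetical usage:
#   for size, line_number in largest_statements("dump.sql"):
#       print(f"line {line_number}: {size / 1024 / 1024:.1f} MB")
```

If the biggest statements come out close to 64MB, each one has to be received and parsed by the server as a single query.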

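I initially ran into various problems with the import, but I was able to get past them by setting the following Aurora parameter group options:

  max_allowed_packet = 1073741824 (1GB)
  wait_timeout = 10800 (3 hours)
  net_read_timeout = 10800
  net_write_timeout = 10800
  interactive_timeout = 10800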

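To begin with, my Aurora instance was a db.t2.small (2GB RAM), but when I ran the import (mysql -u ... mydb < dump.sql) from an EC2 instance (m3.medium, 4GB RAM), the process failed after running for about 1 minute. The RDS logs told me it was an out-of-memory error. I bumped the Aurora instance up to a db.t2.medium (4GB RAM), but the process failed again after about 20 minutes with the same out-of-memory error.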

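I would rather not jump to the next instance type (15GB RAM), but I will if I have to. I regularly import the same mysqldump file into a local MySQL server on the m3.medium EC2 instance I am using, and I have never had any problems. That import takes about 40 minutes.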

Here is the Aurora error log from the last import I attempted:

Available memory is low. Trying to avoid OOM crash: system KB: 4050724 available KB: 101748 low-threshold KB: 202536 decline query: no tune caches: no kill query: no kill connection: no
OOM crash avoidance result: success: no num success: 0 system KB: 4050724 available KB: 111688 low-threshold KB: 202536 recovery time: 11 num declined query: 0 num killed query: 0 num killed connection: 0
OOM crash avoidance result: success: yes num success: 1 system KB: 4050724 available KB: 529464 low-threshold KB: 202536 recovery time: 24 num declined query: 0 num killed query: 0 num killed connection: 0
Available memory is low. Trying to avoid OOM crash: system KB: 4050724 available KB: 200956 low-threshold KB: 202536 decline query: no tune caches: no kill query: no kill connection: no
OOM crash avoidance result: success: yes num success: 2 system KB: 4050724 available KB: 556020 low-threshold KB: 202536 recovery time: 5 num declined query: 0 num killed query: 0 num killed connection: 0
Available memory is low. Trying to avoid OOM crash: system KB: 4050724 available KB: 170392 low-threshold KB: 202536 decline query: no tune caches: no kill query: no kill connection: no
OOM crash avoidance result: success: yes num success: 3 system KB: 4050724 available KB: 554108 low-threshold KB: 202536 recovery time: 7 num declined query: 0 num killed query: 0 num killed connection: 0
Available memory is low. Trying to avoid OOM crash: system KB: 4050724 available KB: 194900 low-threshold KB: 202536 decline query: no tune caches: no kill query: no kill connection: no
OOM crash avoidance result: success: yes num success: 4 system KB: 4050724 available KB: 554340 low-threshold KB: 202536 recovery time: 8 num declined query: 0 num killed query: 0 num killed connection: 0
Available memory is low. Trying to avoid OOM crash: system KB: 4050724 available KB: 198780 low-threshold KB: 202536 decline query: no tune caches: no kill query: no kill connection: no
OOM crash avoidance result: success: no num success: 4 system KB: 4050724 available KB: 133160 low-threshold KB: 202536 recovery time: 11 num declined query: 0 num killed query: 0 num killed connection: 0
OOM crash avoidance result: success: yes num success: 5 system KB: 4050724 available KB: 556540 low-threshold KB: 202536 recovery time: 25 num declined query: 0 num killed query: 0 num killed connection: 0
Available memory is low. Trying to avoid OOM crash: system KB: 4050724 available KB: 170224 low-threshold KB: 202536 decline query: no tune caches: no kill query: no kill connection: no
OOM crash avoidance result: success: yes num success: 6 system KB: 4050724 available KB: 579368 low-threshold KB: 202536 recovery time: 1 num declined query: 0 num killed query: 0 num killed connection: 0
Available memory is low. Trying to avoid OOM crash: system KB: 4050724 available KB: 175612 low-threshold KB: 202536 decline query: no tune caches: no kill query: no kill connection: no
<jemalloc>: Error in mmap(): err: 12, msg: Cannot allocate memory
<jemalloc>: Error in malloc(): out of memory
<jemalloc>: System-wide: MemTotal: 4050724kb, MemFree: 137440kb, Buffers: 20428kb, Cached: 62340kb, Active: 2188968kb, Dirty: 204kb, Inactive: 37228kb, Mapped: 41592kb
<jemalloc>: terminating process due to out of resources
10:58:03 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.

key_buffer_size=16777216
read_buffer_size=262144
max_used_connections=6
max_threads=90
thread_count=5
connection_count=5
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 63814 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x2ab59d623000
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 2ab524642c08 thread_stack 0x40000
/rdsdbbin/oscar/bin/mysqld(my_print_stacktrace+0x2c)[0x9897ec]
/rdsdbbin/oscar/bin/mysqld(handle_fatal_signal+0x491)[0x6f0651]
/lib64/libpthread.so.0(+0xf5b0)[0x2ab5165405b0]
/lib64/libc.so.6(gsignal+0x39)[0x2ab51925cbe9]
/lib64/libc.so.6(abort+0x148)[0x2ab51925dfe8]
/rdsdbbin/oscar/lib/libjemalloc.so(malloc+0x1226)[0x2ab5160f62e6]
/rdsdbbin/oscar/bin/mysqld(my_malloc+0x25)[0x986525]
/rdsdbbin/oscar/bin/mysqld(alloc_root+0x8f)[0x9824bf]
/rdsdbbin/oscar/bin/mysqld[0x5b57a5]
/rdsdbbin/oscar/bin/mysqld(_Z10MYSQLparsePv+0xc1ce)[0x826ade]
/rdsdbbin/oscar/bin/mysqld(_Z9parse_sqlP3THDP12Parser_stateP19Object_creation_ctx+0xb5)[0x772175]
/rdsdbbin/oscar/bin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0xf2)[0x772502]
/rdsdbbin/oscar/bin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0xf43)[0x774003]
/rdsdbbin/oscar/bin/mysqld(_ZN22OscarSchedulerConsumer7consumeEjj+0xd3)[0x803963]
/rdsdbbin/oscar/bin/mysqld(_ZN22OscarSchedulerConsumer5startEv+0x98)[0x803a98]
/rdsdbbin/oscar/bin/mysqld(_ZN22OscarSchedulerConsumer11drain_queueEPv+0x6a)[0x803cda]
/lib64/libpthread.so.0(+0x7f18)[0x2ab516538f18]
/lib64/libc.so.6(clone+0x6d)[0x2ab51930bb2d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (2ab5ad400010): INSERT INTO `Documents` VALUES (478555572,150317,1321817,1,9,627609600,0,60,5471267,0,639014400,''),(478555571,150317,1321816,1,1,623980800,0,60,0,0,623980800,''),(478555575,150318,1321820,1,1,623980800,0,60,0,0,623980800,'')
Connection ID (thread ID): 6
Status: NOT_KILLED
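The backtrace above ends in MYSQLparse, alloc_root, my_malloc and malloc, so the failed allocation appears to have happened while the server was parsing the INSERT statement itself. One thing that may therefore be worth trying is feeding Aurora smaller multi-row inserts. Re-dumping with a smaller mysqldump --net-buffer-length is probably the simpler route if the source database is at hand; otherwise an existing dump line could be re-batched roughly as sketched below. The helper names `split_rows` and `rebatch_insert` are made up for illustration, and the sketch assumes mysqldump-style statements of the form INSERT INTO `t` VALUES (...),(...); with backslash-escaped strings.

```python
def split_rows(values_sql):
    """Yield each top-level "(...)" row tuple from the text after VALUES.

    Parentheses and commas inside single-quoted strings are ignored;
    backslash escapes and doubled quotes ('') are both handled.
    """
    depth = 0
    in_string = False
    escaped = False
    start = 0
    for i, ch in enumerate(values_sql):
        if in_string:
            if escaped:
                escaped = False          # character after a backslash is literal
            elif ch == "\\":
                escaped = True
            elif ch == "'":
                in_string = False        # a doubled '' simply re-enters the string
        elif ch == "'":
            in_string = True
        elif ch == "(":
            if depth == 0:
                start = i
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                yield values_sql[start:i + 1]


def rebatch_insert(statement, max_rows):
    """Split one multi-row INSERT into statements of at most `max_rows` rows.

    Anything that does not look like "INSERT INTO ... VALUES ..." is
    returned unchanged as a single-element list.
    """
    prefix, separator, rest = statement.partition(" VALUES ")
    if not statement.startswith("INSERT INTO") or not separator:
        return [statement]
    rows = list(split_rows(rest))
    return [
        prefix + " VALUES " + ",".join(rows[i:i + max_rows]) + ";"
        for i in range(0, len(rows), max_rows)
    ]
```

Whether smaller statements actually avoid the out-of-memory crash on a db.t2.medium is untested here; the sketch only shows how the statement size could be reduced without changing the inserted data.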

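Given that my EC2 instance with 4GB RAM handles the import fine, surely this must be a configuration issue. Are there other parameter group options I could try changing?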

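I also tried to disable binary logging by setting the binlog_format parameter to OFF in my DB cluster parameter group (following the linked instructions) and rebooting the instance, but when I run the query select @@binlog_format, the result I get is STATEMENT.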

1 Answer:

Answer 0: (score: 0)

SET GLOBAL max_connections = 6  # down from 90; this will reduce [mysqld] RAM requirements. Raise the limit as needed after the import.
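For a rough sense of how much this could save, the formula printed in the crash log (key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 63814 K) can be re-evaluated with 6 threads instead of 90. This is only a back-of-the-envelope sketch: sort_buffer_size is not shown in the log, so the per-thread figure is inferred from the logged total, and it assumes max_threads follows max_connections. It is mysqld's own rough estimate and may not reflect Aurora's real memory use.

```python
# Values copied from the crash log in the question.
KEY_BUFFER_SIZE = 16777216      # bytes
LOGGED_ESTIMATE_KB = 63814      # "= 63814 K bytes of memory"
LOGGED_MAX_THREADS = 90
SYSTEM_KB = 4050724             # "system KB" from the OOM avoidance lines


def estimate_kb(key_buffer, per_thread, max_threads):
    """mysqld's logged estimate: key buffer plus per-thread buffers, in KB."""
    return (key_buffer + per_thread * max_threads) // 1024


# Inferred (read_buffer_size + sort_buffer_size); an assumption, because
# sort_buffer_size itself does not appear in the log.
per_thread = (LOGGED_ESTIMATE_KB * 1024 - KEY_BUFFER_SIZE) // LOGGED_MAX_THREADS

at_90 = estimate_kb(KEY_BUFFER_SIZE, per_thread, 90)
at_6 = estimate_kb(KEY_BUFFER_SIZE, per_thread, 6)

print(f"per-thread buffers: {per_thread} bytes")
print(f"estimate at 90 threads: {at_90} KB ({at_90 / SYSTEM_KB:.1%} of system memory)")
print(f"estimate at 6 threads:  {at_6} KB")
print(f"difference: {at_90 - at_6} KB")
```

By this estimate the saving is on the order of tens of MB on an instance with about 4GB of RAM, so lowering max_connections may help at the margin but probably does not account for the whole shortfall.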