Flyway定期挂起(MariaDB连接器,RDS)

时间:2017-06-08 16:13:21

标签: java mysql flyway rds amazon-ecs

在迁移步骤中,我一直看到部署频繁挂起。用于Tomcat的WAR打包的Java / Scala应用程序。数据库是使用MariaDB连接器(https://downloads.mariadb.org/connector-java/)的RDS Aurora。

可能与Flyway无关,但是获得连接是一个普遍的问题。

从容器中的shell运行迁移:

java -cp `echo WEB-INF/lib/*|tr ' ' :` foo.Migrate

迁移代码如下:

def main(args: Array[String]): Unit = {
  Environment.dbFlywayPassword.foreach { pass =>
    val flyway = new Flyway
    flyway.setDataSource(Environment.jdbcUrl, "flyway", pass)
    flyway.migrate
  }
}

连接字符串:

jdbc:mysql:aurora://%RDS_HOST%/xxx?serverSslCert=/rds-ca-2015-root.pem&useSSL=true&connectTimeout=10000

我尝试在Flyway中增加日志记录级别,但在此行之后没有记录任何内容:

15:57:35.115 [main] INFO  o.f.c.internal.util.VersionPrinter - Flyway 4.2.0 by Boxfuse

所以我得到了一个线程转储,看起来像这样:

15:57:35.115 [main] INFO  o.f.c.internal.util.VersionPrinter - Flyway 4.2.0 by Boxfuse
2017-06-08 15:57:56
Full thread dump OpenJDK 64-Bit Server VM (25.121-b13 mixed mode):

"MariaDb-failover-1" #8 daemon prio=5 os_prio=0 tid=0x00005555f80ae000 nid=0x14 waiting on condition [0x00007fc330b8f000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000f5c59b10> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

"Service Thread" #7 daemon prio=9 os_prio=0 tid=0x00005555f70bf000 nid=0x12 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C1 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00005555f7063000 nid=0x11 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00005555f7060800 nid=0x10 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00005555f705e800 nid=0xf waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00005555f702f000 nid=0xe in Object.wait() [0x00007fc331616000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000f5a30c58> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
    - locked <0x00000000f5a30c58> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)

"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00005555f702c000 nid=0xd in Object.wait() [0x00007fc331717000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000f5a30e10> (a java.lang.ref.Reference$Lock)
    at java.lang.Object.wait(Object.java:502)
    at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
    - locked <0x00000000f5a30e10> (a java.lang.ref.Reference$Lock)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)

"main" #1 prio=5 os_prio=0 tid=0x00005555f6f85000 nid=0xb runnable [0x00007fc34341d000]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:171)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
    at sun.security.ssl.InputRecord.read(InputRecord.java:503)
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973)
    - locked <0x00000000f0a66090> (a java.lang.Object)
    at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:930)
    at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
    - locked <0x00000000f0a81eb0> (a sun.security.ssl.AppInputStream)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
    - locked <0x00000000f06b5118> (a java.io.BufferedInputStream)
    at org.mariadb.jdbc.internal.io.input.StandardPacketInputStream.getPacketArray(StandardPacketInputStream.java:125)
    at org.mariadb.jdbc.internal.io.input.StandardPacketInputStream.getPacket(StandardPacketInputStream.java:95)
    at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.readPacket(AbstractQueryProtocol.java:1002)
    at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.getResult(AbstractQueryProtocol.java:982)
    at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.readRequestSessionVariables(AbstractConnectProtocol.java:498)
    at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.readPipelineAdditionalData(AbstractConnectProtocol.java:544)
    at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connect(AbstractConnectProtocol.java:410)
    at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connect(AbstractConnectProtocol.java:357)
    at org.mariadb.jdbc.internal.protocol.AuroraProtocol.loop(AuroraProtocol.java:149)
    at org.mariadb.jdbc.internal.failover.impl.AuroraListener.reconnectFailedConnection(AuroraListener.java:179)
    at org.mariadb.jdbc.internal.failover.impl.MastersSlavesListener.initializeConnection(MastersSlavesListener.java:154)
    at org.mariadb.jdbc.internal.failover.FailoverProxy.<init>(FailoverProxy.java:94)
    at org.mariadb.jdbc.internal.util.Utils.retrieveProxy(Utils.java:464)
    at org.mariadb.jdbc.Driver.connect(Driver.java:103)
    at org.flywaydb.core.internal.util.jdbc.DriverDataSource.getConnectionFromDriver(DriverDataSource.java:416)
    at org.flywaydb.core.internal.util.jdbc.DriverDataSource.getConnection(DriverDataSource.java:381)
    at org.flywaydb.core.internal.util.jdbc.JdbcUtils.openConnection(JdbcUtils.java:51)
    at org.flywaydb.core.Flyway.execute(Flyway.java:1418)
    at org.flywaydb.core.Flyway.migrate(Flyway.java:971)
    at tgam.service.data.Migrate$.$anonfun$main$1(Migrate.scala:11)
    at foo.Migrate$.$anonfun$main$1$adapted(Migrate.scala:8)
    at foo.Migrate$$$Lambda$4/458209687.apply(Unknown Source)
    at scala.Option.foreach(Option.scala:257)
    at foo.Migrate$.main(Migrate.scala:8)
    at foo.Migrate.main(Migrate.scala)

"VM Thread" os_prio=0 tid=0x00005555f7022000 nid=0xc runnable

"VM Periodic Task Thread" os_prio=0 tid=0x00005555f70da800 nid=0x13 waiting on condition

JNI global references: 232

Heap
 def new generation   total 4928K, used 1611K [0x00000000f0600000, 0x00000000f0b50000, 0x00000000f5950000)
  eden space 4416K,  24% used [0x00000000f0600000, 0x00000000f0712c10, 0x00000000f0a50000)
  from space 512K, 100% used [0x00000000f0a50000, 0x00000000f0ad0000, 0x00000000f0ad0000)
  to   space 512K,   0% used [0x00000000f0ad0000, 0x00000000f0ad0000, 0x00000000f0b50000)
 tenured generation   total 10944K, used 4187K [0x00000000f5950000, 0x00000000f6400000, 0x0000000100000000)
   the space 10944K,  38% used [0x00000000f5950000, 0x00000000f5d66e78, 0x00000000f5d67000, 0x00000000f6400000)
 Metaspace       used 13061K, capacity 13306K, committed 13568K, reserved 1060864K
  class space    used 1381K, capacity 1449K, committed 1536K, reserved 1048576K

看起来I / O挂起org.mariadb.jdbc.Driver.connect,但我确实设置了connectTimeout(10秒)。此超时似乎没有效果(我需要socketTimeouthttps://github.com/brettwooldridge/HikariCP/issues/754吗?)

这已经发生了一段时间。当我使用Tomcat的contextInitialized挂钩进行迁移时,也发生了同样的事情。在开始使用Tomcat之前,我决定重构一个单独的调用,这看起来更好,但它并没有影响这种行为。

通常会发生的情况是,代码将挂起,2-3分钟后ECS将超时,并触发重新部署。经过一些重试(例如最多10次)后,Flyway将成功运行并开始服务。

1 个答案:

答案 0 :(得分:2)

好吧,看来这是RDS Aurora中的一个已知错误 - 并且它在MariaDB Connector doc中提到过(当然它应该是运行时警告!)

https://mariadb.com/kb/en/mariadb/about-mariadb-connector-j/#infrequently-used

  

usePipelineAuth 与极光不兼容在连接期间,   执行不同的查询。当选项处于活动时,这些查询   使用管道发送(所有查询都发送,然后只发送所有结果   读取),允许更快的连接创建。默认值:true。   自1.6.0起

同样感谢Freenode #maria 中的wlad_,他指出了我正确的方向。