我为我的客户管理一个rails应用程序,最近它崩溃了。在我发现之前,该网站已经停机了9个小时。我检查了日志,过去9小时的每个请求都附加了以下代码:
at=error code=H10 desc="App crashed"
在此之前,我会看到以下日志:
2012-11-16T00:55:46+00:00 heroku[web.1]: Idling
2012-11-16T00:55:50+00:00 heroku[web.1]: Stopping all processes with SIGTERM
2012-11-16T00:55:51+00:00 app[web.1]: [2012-11-16 00:55:51] ERROR SignalException: SIGTERM
2012-11-16T00:55:51+00:00 app[web.1]: /usr/local/lib/ruby/1.9.1/webrick/server.rb:90:in `select'
2012-11-16T00:56:00+00:00 heroku[web.1]: Error R12 (Exit timeout) -> At least one process failed to exit within 10 seconds of SIGTERM
2012-11-16T00:56:00+00:00 heroku[web.1]: Stopping remaining processes with SIGKILL
2012-11-16T00:56:02+00:00 heroku[web.1]: State changed from up to down
2012-11-16T00:56:02+00:00 heroku[web.1]: Process exited with status 137
2012-11-16T01:03:55+00:00 heroku[web.1]: Unidling
2012-11-16T01:03:55+00:00 heroku[web.1]: State changed from down to starting
2012-11-16T01:03:59+00:00 heroku[web.1]: Starting process with command `bundle exec rails server -p 4303`
2012-11-16T01:04:00+00:00 heroku[nginx]: 98.139.241.251 - - [16/Nov/2012:01:04:00 +0000] "GET / HTTP/1.1" 499 0 "-" "YahooCacheSystem" domain.com
2012-11-16T01:04:22+00:00 app[web.1]: => Ctrl-C to shutdown server
2012-11-16T01:04:22+00:00 app[web.1]: ** [NewRelic][11/16/12 01:04:21 +0000 b8af98a1-2246-4b34-9dfe-61b9d4b747bc (2)] INFO : Dispatcher: webrick
2012-11-16T01:04:22+00:00 app[web.1]: ** [NewRelic][11/16/12 01:04:21 +0000 b8af98a1-2246-4b34-9dfe-61b9d4b747bc (2)] INFO : Application: acsolar
2012-11-16T01:04:22+00:00 app[web.1]: ** [NewRelic][11/16/12 01:04:21 +0000 b8af98a1-2246-4b34-9dfe-61b9d4b747bc (2)] INFO : New Relic Ruby Agent 3.4.0.1 Initialized: pid = 2
2012-11-16T01:04:22+00:00 app[web.1]: => Booting WEBrick
2012-11-16T01:04:22+00:00 app[web.1]: => Rails 3.1.1 application starting in production on http://0.0.0.0:4303
2012-11-16T01:04:22+00:00 app[web.1]: => Call with -d to detach
2012-11-16T01:04:25+00:00 app[web.1]: [DEPRECATION] Your applications public directory contains an assets/products and/or assets/taxons subdirectory.
2012-11-16T01:04:25+00:00 app[web.1]: Run `rake spree:assets:relocate_images` to relocate the images.
2012-11-16T01:04:34+00:00 app[web.1]: ** [NewRelic][11/16/12 01:04:32 +0000 b8af98a1-2246-4b34-9dfe-61b9d4b747bc (2)] INFO : Reporting performance data every 60 seconds.
2012-11-16T01:04:34+00:00 app[web.1]: Connected to NewRelic Service at collector-5.newrelic.com
2012-11-16T01:05:00+00:00 heroku[web.1]: Error R10 (Boot timeout) -> Web process failed to bind to $PORT within 60 seconds of launch
2012-11-16T01:05:00+00:00 heroku[web.1]: Stopping process with SIGKILL
2012-11-16T01:05:02+00:00 heroku[web.1]: Process exited with status 137
2012-11-16T01:05:02+00:00 heroku[web.1]: State changed from crashed to down
2012-11-16T01:05:02+00:00 heroku[web.1]: State changed from starting to crashed
我猜它可能已经停止运行并且在启动时出错,但是为什么它在没有重新启动的情况下仍处于崩溃状态?如果将来再次发生这种情况,有什么办法可以让它自动重启吗?
我也有NewRelic在这上面运行它根本没有通知我,但这是我要调查的另一个问题。
答案 0 :(得分:4)
Heroku的支持回答建议使用heroku restart
手动重启您的应用。他们现在正在解决这个问题。
嗨,我们这边的流程管理错误导致了一些崩溃的应用程序 只运行1个web dyno报告为“空闲”,即使它们是 实际上已经崩溃了。这意味着坠毁的dyno永远不会 重新启动,导致后续请求失败。我们已经确定了这一点 问题并正在实施修复。如果您的应用仍然没有响应, 请尝试使用heroku restart命令重新启动它。请让 我们知道您是否需要更多帮助。谢谢,Heroku支持