Question

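I am debugging some Postgres connection leaks in my application. A few days ago we suddenly crossed 100 connections when we shouldn't have, since we only have 8 unicorn workers and one sidekiq process (25 threads).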

I was looking at htop today and saw that my unicorn workers are spawning a large number of threads.

Am I reading this right? Shouldn't this not be happening? If these are spawned threads, any idea how to debug this?

Thanks! By the way, here is my other question (about the Postgres connections): Debugging unicorn postgres connection leak

Edit

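I just followed some of the tips mentioned here - http://varaneckas.com/blog/ruby-tracing-threads-unicorn/ - and when I print the stack traces from a worker's threads, this is what I get when there are a lot of threads.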

[17176] ---
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `pop'
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `block in spawn_threadpool'
[17176] ---
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `pop'
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `block in spawn_threadpool'
[17176] ---
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `pop'
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `block in spawn_threadpool'
[17176] ---
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `pop'
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `block in spawn_threadpool'
[17176] ---
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `pop'
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `block in spawn_threadpool'
[17176] ---
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `pop'
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `block in spawn_threadpool'
[17176] ---
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `pop'
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `block in spawn_threadpool'
[17176] ---
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `pop'
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `block in spawn_threadpool'
[17176] ---
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `pop'
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `block in spawn_threadpool'
[17176] ---
[17176] -------------------
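
For reference, a dump in roughly this format could be produced with plain Ruby. This is a minimal sketch, not the code from the linked post: the helper name `thread_backtrace_dump` is hypothetical, and in a real unicorn worker it would presumably be called from a signal handler or similar hook.

```ruby
# Hypothetical helper (not from the linked post): builds a dump of every
# live thread's backtrace, prefixing each line with the process id.
def thread_backtrace_dump(pid = Process.pid)
  lines = []
  Thread.list.each do |thread|
    lines << "[#{pid}] ---"
    # Thread#backtrace returns nil for a dead thread, so fall back to an empty list.
    (thread.backtrace || []).each { |frame| lines << "[#{pid}] #{frame}" }
  end
  lines << "[#{pid}] -------------------"
  lines.join("\n")
end

puts thread_backtrace_dump
```

Many identical two-frame traces ending in `pop`, as above, suggest idle pool threads rather than threads stuck doing work.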

Here is my unicorn.rb: https://gist.github.com/steverob/b83e41bb49d78f9aa32f79136df5af5f . It spawns a thread for EventMachine in after_fork.

The reason for EventMachine is this: https://github.com/keenlabs/keen-gem#asynchronous-publishing

Is this normal? Shouldn't the threads be getting killed? Could this also be causing unnecessary DB connections to be opened? Thanks

Update: I just found out that I am using an older version of the PubNub gem, which uses EM, and I came across these lines in the pubnub.log file -

D, [2016-04-06T21:31:12.130123 #1573] DEBUG -- pubnub: Created event Pubnub::Publish
D, [2016-04-06T21:31:12.130144 #1573] DEBUG -- pubnub: Pubnub::SingleEvent#fire
D, [2016-04-06T21:31:12.130162 #1573] DEBUG -- pubnub: Pubnub::SingleEvent#fire | Adding event to async_events
D, [2016-04-06T21:31:12.130178 #1573] DEBUG -- pubnub: Pubnub::SingleEvent#fire | Starting railgun
D, [2016-04-06T21:31:12.130194 #1573] DEBUG -- pubnub: Pubnub::Client#start_event_machine | starting EM in new thread
D, [2016-04-06T21:31:12.130243 #1573] DEBUG -- pubnub: Pubnub::Client#start_event_machine | We aren't running on thin
D, [2016-04-06T21:31:12.130264 #1573] DEBUG -- pubnub: Pubnub::Client#start_event_machine | EM already running

Answer 1

So, after all, this behavior seems to be normal in your particular case.

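The unicorn thread stack traces you provided (obtained using this method) point to the spawn_threadpool method in EventMachine. That code in EventMachine is invoked when some other code calls EventMachine.defer, a method that spawns a pool of 20 threads by default on its first invocation. I found that EventMachine.defer was used in older versions of the pubnub gem (for example here), but it could be used in other places as well.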

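So I think this explains the large number of threads you are observing on each worker. They are mostly waiting on the pop method, which suspends the thread until something is pushed onto the queue (again, by EventMachine.defer). So unless you have a lot of deferred operations, the threads are doing almost nothing.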
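
The idle-pool behavior can be illustrated with plain Ruby. This is a simplified sketch of the general pattern (worker threads blocked on Queue#pop), not EventMachine's actual implementation; the pool size of 5 and the :stop sentinel are assumptions made for the example.

```ruby
require 'thread' # Queue; a no-op on recent Ruby versions

queue   = Queue.new # pending jobs
results = Queue.new # finished results

# Spawn a small pool; each thread blocks in queue.pop until a job arrives.
pool = Array.new(5) do
  Thread.new do
    loop do
      job = queue.pop          # suspends the thread while the queue is empty
      break if job == :stop    # sentinel used only in this sketch
      results << job.call
    end
  end
end

# Wait until every worker is parked in pop, then count the idle ones.
sleep 0.01 until pool.all? { |t| t.status == 'sleep' }
idle = pool.count { |t| t.status == 'sleep' }
puts "idle threads: #{idle}"

# Pushing a job wakes exactly one worker.
queue << -> { 1 + 1 }
value = results.pop
puts "result: #{value}"

# Shut the pool down cleanly.
pool.size.times { queue << :stop }
pool.each(&:join)
```

A thread parked in pop uses essentially no CPU, which is why a worker can show many threads in htop while doing very little.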

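If you don't need 20 threads on every unicorn worker standing by for deferrable operations (most likely you don't), you could try lowering the number of threads in the pool by setting the threadpool_size variable to a reasonable number, for example:

EventMachine.threadpool_size = 5

I would put it in the after_fork block of the unicorn config.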
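
As a rough sketch, the relevant part of a unicorn config might look like the fragment below. It assumes the setting is applied before anything in the worker calls EventMachine.defer for the first time, since the pool is created on that first call; the value 5 is only an example, and the rest of the after_fork block from the linked gist is left out.

```ruby
# config/unicorn.rb (fragment, not a standalone script)
after_fork do |server, worker|
  require 'eventmachine'

  # Assumption: this runs before the first EventMachine.defer call in the
  # worker, so the pool is spawned with 5 threads instead of the default 20.
  EventMachine.threadpool_size = 5

  # ... existing after_fork code (e.g. starting the EventMachine thread) ...
end
```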

Also, as another option, you could consider using the unicorn-worker-killer gem to kill the unicorn workers periodically.

By the way, the messages that pubnub writes to its log seem fine: it is only telling us that it found an already initialized EventMachine thread, so it does not have to start a new one. The source code clarifies this.

Answer 2

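I ran into this issue today with version 4. When using PubNub in a background worker, the thread count kept climbing until it errored out. The workaround is as follows: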

client = Pubnub.new(...)
client.publish(...)
client.telemetry.terminate
