Basically, I am indexing CSV files into Cassandra, and after a while I get this error:
failed to create a child event loop
java.lang.IllegalStateException: failed to create a child event loop
at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:68)
at io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:50)
at io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:70)
at io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:65)
at io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:56)
at com.datastax.driver.core.NettyUtil.newEventLoopGroupInstance(NettyUtil.java:139)
at com.datastax.driver.core.NettyOptions.eventLoopGroup(NettyOptions.java:99)
at com.datastax.driver.core.Connection$Factory.<init>(Connection.java:774)
at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1446)
at com.datastax.driver.core.Cluster.init(Cluster.java:159)
at com.datastax.driver.core.Cluster.connectAsync(Cluster.java:330)
at com.datastax.driver.core.Cluster.connectAsync(Cluster.java:305)
at com.datastax.driver.core.Cluster.connect(Cluster.java:247)
at com.dy.scyllaindexer.Indexer_SlaveActor$$anonfun$receive$1.applyOrElse(Indexer_SlaveActor.scala:38)
at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
at com.dy.scyllaindexer.Indexer_SlaveActor.aroundReceive(Indexer_SlaveActor.scala:24)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:527)
at akka.actor.ActorCell.invoke(ActorCell.scala:496)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
at akka.dispatch.Mailbox.run(Mailbox.scala:224)
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: io.netty.channel.ChannelException: failed to open a new selector
at io.netty.channel.nio.NioEventLoop.openSelector(NioEventLoop.java:176)
at io.netty.channel.nio.NioEventLoop.<init>(NioEventLoop.java:150)
at io.netty.channel.nio.NioEventLoopGroup.newChild(NioEventLoopGroup.java:103)
at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:64)
... 24 more
Caused by: java.io.IOException: Too many open files
at sun.nio.ch.EPollArrayWrapper.epollCreate(Native Method)
at sun.nio.ch.EPollArrayWrapper.<init>(EPollArrayWrapper.java:130)
at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:69)
at sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:36)
at io.netty.channel.nio.NioEventLoop.openSelector(NioEventLoop.java:174)
... 27 more
I ran
lsof -p MY_PROCESS_ID
and saw a large number of FIFO pipes being created (thousands):
java 16082 my_process *706w FIFO 0,8 0t0 285381393 pipe
...
...
thousands
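My process creates 12 actors at any given time (6 slaves, plus one worker per slave).
I am reading local CSV files with Akka Streams at a parallelism of 100 (I have tried many different values) and writing to Cassandra asynchronously.
The code is similar to this: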
class SlaveActor(...) extends Actor {
  def receive = {
    case DoSomething => {
      val indexer = context.actorOf(CassandraIndexer.props(...))
      val message = Message(...)
      val f = (indexer ? message)
      val ff = f andThen {
        case x: Try[..] => indexer ! PoisonPill ... // force it
      }
      ...
      ff.onComplete {
        case Failure... => {}
        case Success ... => {}
      }
    }
  }
}
class CassandraIndexer (...) extends Actor {
  def receive = {
    case Message(...) =>
      implicit val session = Cluster.builder().addContactPoints(hosts).withPort(port).build().connect()
      val flow: Sink[Map[String, String], Future[Done]] = Flow[Map[String, String]].mapAsyncUnordered(parallelism = 100) {
        item: Map[String, String] =>
          Future {
            val query = session....bind(item...)
            session.execute(query)
          }
      }.toMat(Sink.ignore)(Keep.right)
      CsvSourceMaker.createSourceFromFile(csvfile).runWith(flow)
      ... when completed ...
      session.close()
  }
}
Answer 0 (score: 1)
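I think a few things could be contributing to this:

1. For every DoSomething message a SlaveActor receives, you create a new CassandraIndexer actor. Why not have just 1 CassandraIndexer per SlaveActor?
2. Every Message a CassandraIndexer actor instance receives creates a new Cluster instance.

Cluster is a relatively heavyweight object that creates a connection pool for each host in your Cassandra cluster (for details, see 4 simple rules when using the DataStax drivers for cassandra). The socket connections in these connection pools are probably the source of many of the file descriptors being created. I would recommend the following:

1. Have only 1 CassandraIndexer actor per SlaveActor.
2. Have only 1 Cluster per CassandraIndexer, or only 1 Cluster overall.

This limits you to 1 Cluster per SlaveActor, reduces the number of connections to your C* cluster, and will probably speed up your application, since Cluster initialization no longer has to happen every time you execute a query.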
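
As a rough illustration of the two recommendations, here is a minimal sketch, assuming classic Akka actors and the DataStax Java driver 3.x (the com.datastax.driver.core package seen in the stack trace). The IndexFile message, the indexFile helper, the "indexer" actor name and the "data.csv" path are hypothetical placeholders and not part of the original code. The stream from the question would go where indexFile is called.

```scala
import akka.actor.{Actor, ActorRef, Props}
import com.datastax.driver.core.{Cluster, Session}

// Hypothetical message types standing in for the ones in the question.
case object DoSomething
final case class IndexFile(csvFile: String)

object CassandraIndexer {
  def props(hosts: Seq[String], port: Int): Props =
    Props(new CassandraIndexer(hosts, port))
}

class CassandraIndexer(hosts: Seq[String], port: Int) extends Actor {
  // Built once per actor instance, not once per message.
  private val cluster: Cluster =
    Cluster.builder().addContactPoints(hosts: _*).withPort(port).build()

  // Connects on first use and is then reused for every message.
  private lazy val session: Session = cluster.connect()

  def receive: Receive = {
    case IndexFile(csvFile) =>
      // Run the CSV stream against the shared session here.
      // Do not close the session after each file.
      indexFile(session, csvFile)
  }

  // Hypothetical placeholder for the Akka Streams pipeline in the question.
  private def indexFile(session: Session, csvFile: String): Unit = ()

  // Closing the Cluster also closes the sessions created from it,
  // which releases the pooled sockets and selectors.
  override def postStop(): Unit = cluster.close()
}

class SlaveActor(hosts: Seq[String], port: Int) extends Actor {
  // One long-lived child instead of a new child per DoSomething message.
  private val indexer: ActorRef =
    context.actorOf(CassandraIndexer.props(hosts, port), "indexer")

  def receive: Receive = {
    case DoSomething => indexer ! IndexFile("data.csv") // hypothetical path
  }
}
```

With this shape, the number of Cluster instances is bounded by the number of SlaveActor instances rather than by the number of messages processed. To go down to a single Cluster overall, one option is to build it once at application startup and pass the resulting Session into the actors' constructors.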