Cassandra (2.0.4) going down due to too many open files

Time: 2017-07-06 02:59:35

Tags: cassandra datastax cassandra-2.0

We are currently running Cassandra 2.0.14. Machines in the cluster keep going down, and I see the following exceptions in the logs.

WARN [New I/O server boss #33] 2017-07-06 06:37:33,097 Slf4JLogger.java (line 76) Failed to accept a connection.
java.io.IOException: Too many open files
        at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:241)
        at org.jboss.netty.channel.socket.nio.NioServerBoss.process(NioServerBoss.java:100)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
        at org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
ERROR [COMMIT-LOG-ALLOCATOR] 2017-07-06 06:37:33,123 StorageService.java (line 377) Stopping RPC server
 INFO [COMMIT-LOG-ALLOCATOR] 2017-07-06 06:37:33,123 ThriftServer.java (line 141) Stop listening to thrift clients
ERROR [COMMIT-LOG-ALLOCATOR] 2017-07-06 06:37:33,132 StorageService.java (line 382) Stopping native transport
 INFO [COMMIT-LOG-ALLOCATOR] 2017-07-06 06:37:34,965 Server.java (line 182) Stop listening for CQL clients
ERROR [COMMIT-LOG-ALLOCATOR] 2017-07-06 06:37:34,969 CommitLog.java (line 390) Failed to allocate new commit log segments. Commit disk failure policy is stop; terminating thread
FSWriteError in /myntra/cassandra/commitlog/CommitLog-3-1499285518666.log
        at org.apache.cassandra.db.commitlog.CommitLogSegment.<init>(CommitLogSegment.java:143)
        at org.apache.cassandra.db.commitlog.CommitLogSegment.freshSegment(CommitLogSegment.java:90)
        at org.apache.cassandra.db.commitlog.CommitLogAllocator.createFreshSegment(CommitLogAllocator.java:262)
        at org.apache.cassandra.db.commitlog.CommitLogAllocator.access$500(CommitLogAllocator.java:50)
        at org.apache.cassandra.db.commitlog.CommitLogAllocator$1.runMayThrow(CommitLogAllocator.java:109)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.FileNotFoundException: /myntra/cassandra/commitlog/CommitLog-3-1499285518666.log (Too many open files)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
        at org.apache.cassandra.db.commitlog.CommitLogSegment.<init>(CommitLogSegment.java:125)
        ... 6 more 

We have increased the resource limits as per the DataStax production recommendations. Cassandra is run by the root user, whose file descriptor limit is:

[root@lgp-feed-cassandra2 cassandra]# ulimit -n
120000

Limits as seen by the running process:

[root@lgp-feed-cassandra2 cassandra]# cat /proc/117845/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            10485760             unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             32768                32768                processes
Max open files            120000               120000               files
Max locked memory         unlimited            unlimited            bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       255823               255823               signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
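
Since the limit reported in /proc is already 120000, one way to narrow this down is to compare how many descriptors the process actually holds against that soft limit. The following is a minimal sketch, assuming a Linux /proc layout; the function names soft_nofile_limit and open_fd_count are hypothetical helpers, and 117845 is simply the PID from the output above.

```shell
#!/bin/sh
# Print the soft "Max open files" value from a /proc/<pid>/limits-style file.
soft_nofile_limit() {
    awk '/^Max open files/ { print $4 }' "$1"
}

# Count the entries in a /proc/<pid>/fd-style directory
# (one entry per open descriptor).
open_fd_count() {
    ls "$1" | wc -l | tr -d ' '
}

# Example use against a live process (PID taken from the question; adjust as needed).
pid=117845
if [ -d "/proc/$pid/fd" ]; then
    echo "open: $(open_fd_count "/proc/$pid/fd") / limit: $(soft_nofile_limit "/proc/$pid/limits")"
fi
```

If the count climbs toward the limit over time, running ls -l on /proc/<pid>/fd shows what each descriptor points to, which helps tell whether they are mostly sockets or data files.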

I am unable to figure out the exact cause of this issue. Any leads would be helpful.

1 Answer:

Answer 0 (score: 1)

You need to set the ulimit. First check the current value with the command "ulimit -n". We fixed it with the following changes:

root hard nofile 65535

root soft nofile 65535

* hard nofile 65535

* soft nofile 65535

$ sudo cat /etc/security/limits.conf
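
To double-check which nofile entries are actually configured, the file can be filtered for non-comment nofile lines. This is a minimal sketch; nofile_entries is a hypothetical helper name, and the path is the one shown above. Per the limits.conf man page, wildcard entries are not applied to the root user, so a process running as root needs an explicit root entry.

```shell
#!/bin/sh
# Print "<domain> <type> <value>" for each non-comment nofile entry
# in a limits.conf-style file.
nofile_entries() {
    awk '$1 !~ /^#/ && $3 == "nofile" { print $1, $2, $4 }' "$1"
}

# Example use against the system file, if it is readable.
conf=/etc/security/limits.conf
if [ -r "$conf" ]; then
    nofile_entries "$conf"
fi
```

Changes to this file only apply to new login sessions, so the Cassandra process has to be restarted from a fresh session before the new value shows up in /proc/<pid>/limits.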

Refer to this link for more details.