Kafka exception "Too many open files" after running for a long time

Date: 2018-04-06 17:06:27

Tags: apache-spark apache-kafka kafka-producer-api

I have a Kafka producer written in Java that uses the java.nio WatchService API to monitor a directory for new files and pushes any new file to a Kafka topic. A Spark Streaming consumer reads from that topic. After the producer job has been running continuously for about a day, I get the error below. The producer pushes roughly 500 files every 2 minutes. My Kafka topic has 1 partition and a replication factor of 2. Can anyone help?
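The watch loop described above can be sketched with the standard library alone; this is a minimal illustration, not the asker's actual code (the class and method names are mine, and the Kafka send is left as a comment because the real producer code is not shown):

```java
import java.nio.file.*;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class DirectoryWatcher {
    // Poll the watcher once, for at most timeoutMs, and return the names
    // of files created in the watched directory during that window.
    public static List<String> pollNewFiles(WatchService watcher, long timeoutMs)
            throws InterruptedException {
        List<String> names = new ArrayList<>();
        WatchKey key = watcher.poll(timeoutMs, TimeUnit.MILLISECONDS);
        if (key != null) {
            for (WatchEvent<?> event : key.pollEvents()) {
                names.add(event.context().toString());
                // Here the real job would read the file and send it to the
                // Kafka topic, e.g. producer.send(new ProducerRecord<>(...)).
            }
            key.reset(); // re-arm the key so further events are delivered
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("watchdemo");
        try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
            dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
            Files.createFile(dir.resolve("sample.txt"));
            System.out.println(pollNewFiles(watcher, 15000));
        }
    }
}
```

Note that the WatchService itself is opened once and closed via try-with-resources; the same one-instance discipline should apply to the KafkaProducer.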

org.apache.kafka.common.KafkaException: Failed to construct kafka producer         
    at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:342) 
    at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:166) 
    at com.hp.hawkeye.HawkeyeKafkaProducer.Sender.createProducer(Sender.java:60) 
    at com.hp.hawkeye.HawkeyeKafkaProducer.Sender.<init>(Sender.java:38)   
    at com.hp.hawkeye.HawkeyeKafkaProducer.HawkeyeKafkaProducer.<init>(HawkeyeKafkaProducer.java:54) 
    at com.hp.hawkeye.HawkeyeKafkaProducer.myKafkaTestJob.main(myKafkaTestJob.java:81)

Caused by: org.apache.kafka.common.KafkaException: java.io.IOException: Too many open files
    at org.apache.kafka.common.network.Selector.<init>(Selector.java:125)
    at org.apache.kafka.common.network.Selector.<init>(Selector.java:147)  
    at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:306)

... 7 more 
Caused by: java.io.IOException: Too many open files         
     at sun.nio.ch.EPollArrayWrapper.epollCreate(Native Method)         
     at sun.nio.ch.EPollArrayWrapper.<init>(EPollArrayWrapper.java:130)        
     at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:69)      
     at sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:36) 
     at java.nio.channels.Selector.open(Selector.java:227)         
     at org.apache.kafka.common.network.Selector.<init>(Selector.java:123)     
 ... 9 more

1 answer:

Answer 0 (score: 1)

Check ulimit -aH

Contact your administrator and increase the open-files limit, for example:

open files                      (-n) 655536
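On most Linux systems a persistent raise can be made in /etc/security/limits.conf (the user name and the values below are placeholders; a re-login or service restart is needed before they take effect):

```
# /etc/security/limits.conf -- "kafka" user and both values are examples
kafka  soft  nofile  65536
kafka  hard  nofile  655536
```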

Otherwise, I suspect there may be a file-descriptor leak in your code; see:

http://mail-archives.apache.org/mod_mbox/spark-user/201504.mbox/%3CCAKWX9VVJZObU9omOVCfPaJ_bPAJWiHcxeE7RyeqxUHPWvfj7WA@mail.gmail.com%3E
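The stack trace points at Selector.open() inside the KafkaProducer constructor: each producer instance opens an NIO selector, which costs one epoll file descriptor. If the job constructs a new KafkaProducer for every file (note that Sender.createProducer appears in the trace) and never calls close(), descriptors accumulate until the limit is hit. A stdlib-only sketch of that lifecycle issue, with java.nio.channels.Selector standing in for the producer's internal selector (names are mine):

```java
import java.io.IOException;
import java.nio.channels.Selector;

public class SelectorLifecycle {
    // Anti-pattern: the caller "forgets" to close, so the epoll file
    // descriptor behind the Selector stays open. Constructing one
    // KafkaProducer per file without close() leaks in the same way.
    static Selector leakyOpen() throws IOException {
        return Selector.open();
    }

    // Fix: open once, reuse, and release deterministically with
    // try-with-resources. For Kafka: create a single KafkaProducer at
    // startup, reuse it for every file, and call producer.close() on
    // shutdown -- the producer is thread-safe and meant to be shared.
    static boolean openAndClose() throws IOException {
        try (Selector s = Selector.open()) {
            return s.isOpen();
        } // descriptor released here
    }

    public static void main(String[] args) throws IOException {
        System.out.println("selector usable inside try: " + openAndClose());
    }
}
```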