我正在运行Cloudera / Solr集群,并尝试使用hbase-solr(Lily)索引器对Hbase进行NRT索引到Solr。批处理模式索引工作正常。
然而,在我开始以恒定流加载数据后,Lily索引器开始一个接一个地死亡。他们不会打印出跳出来的特定错误消息,但所有错误都以同样的方式结束:
2014-09-10 16:04:56,770 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process identifier=ip-172-31-1-204.ap-southeast-2.compute.internal,44013,1410329096767 connecting to ZooKeeper ensemble=ip-172-31-1-205.ap-southeast-2.compute.internal:2181,ip-172-31-1-206.ap-southeast-2.compute.internal:2181,ip-172-31-1-204.ap-southeast-2.compute.internal:2181
2014-09-10 16:04:56,771 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server ip-172-31-1-206.ap-southeast-2.compute.internal/172.31.1.206:2181. Will not attempt to authenticate using SASL (unknown error)
2014-09-10 16:04:56,772 INFO org.apache.hadoop.ipc.RpcServer: RpcServer.listener,port=44013: starting
2014-09-10 16:04:56,771 INFO org.apache.hadoop.ipc.RpcServer: RpcServer.responder: starting
2014-09-10 16:04:56,773 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to ip-172-31-1-206.ap-southeast-2.compute.internal/172.31.1.206:2181, initiating session
2014-09-10 16:04:56,775 INFO com.ngdata.hbaseindexer.supervisor.IndexerSupervisor: Started indexer for indexFeature
2014-09-10 16:04:56,776 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server ip-172-31-1-206.ap-southeast-2.compute.internal/172.31.1.206:2181, sessionid = 0x1485c7ff13602fd, negotiated timeout = 60000
2014-09-10 16:04:56,813 INFO org.kitesdk.morphline.api.MorphlineContext: Importing commands
2014-09-10 16:04:57,287 INFO org.kitesdk.morphline.api.MorphlineContext: Done importing commands
2014-09-10 16:04:57,289 INFO org.apache.solr.client.solrj.impl.HttpClientUtil: Creating new http client, config:
2014-09-10 16:04:57,297 INFO org.apache.hadoop.ipc.RpcServer: regionserver/ip-172-31-1-204.ap-southeast-2.compute.internal/172.31.1.204:0: started 10 reader(s).
2014-09-10 16:04:57,299 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=ip-172-31-1-205.ap-southeast-2.compute.internal:2181,ip-172-31-1-206.ap-southeast-2.compute.internal:2181,ip-172-31-1-204.ap-southeast-2.compute.internal:2181 sessionTimeout=60000 watcher=ip-172-31-1-204.ap-southeast-2.compute.internal,44713,1410329097297, quorum=ip-172-31-1-205.ap-southeast-2.compute.internal:2181,ip-172-31-1-206.ap-southeast-2.compute.internal:2181,ip-172-31-1-204.ap-southeast-2.compute.internal:2181, baseZNode=/hbase
2014-09-10 16:04:57,301 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process identifier=ip-172-31-1-204.ap-southeast-2.compute.internal,44713,1410329097297 connecting to ZooKeeper ensemble=ip-172-31-1-205.ap-southeast-2.compute.internal:2181,ip-172-31-1-206.ap-southeast-2.compute.internal:2181,ip-172-31-1-204.ap-southeast-2.compute.internal:2181
2014-09-10 16:04:57,302 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server ip-172-31-1-204.ap-southeast-2.compute.internal/172.31.1.204:2181. Will not attempt to authenticate using SASL (unknown error)
2014-09-10 16:04:57,303 INFO org.apache.hadoop.ipc.RpcServer: RpcServer.responder: starting
2014-09-10 16:04:57,303 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to ip-172-31-1-204.ap-southeast-2.compute.internal/172.31.1.204:2181, initiating session
2014-09-10 16:04:57,304 INFO org.apache.hadoop.ipc.RpcServer: RpcServer.listener,port=44713: starting
2014-09-10 16:04:57,306 INFO com.ngdata.hbaseindexer.supervisor.IndexerSupervisor: Started indexer for indexSeenBlock
2014-09-10 16:04:57,307 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server ip-172-31-1-204.ap-southeast-2.compute.internal/172.31.1.204:2181, sessionid = 0x3485c7fee8f0374, negotiated timeout = 60000
2014-09-10 16:04:57,349 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2014-09-10 16:04:57,536 INFO org.mortbay.log: jetty-6.1.26.cloudera.2
2014-09-10 16:04:58,663 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:11060
2014-09-10 16:05:01,591 INFO org.kitesdk.morphline.api.MorphlineContext: Importing commands
2014-09-10 16:05:01,597 INFO org.kitesdk.morphline.api.MorphlineContext: Importing commands
2014-09-10 16:05:01,641 INFO org.kitesdk.morphline.api.MorphlineContext: Importing commands
2014-09-10 16:05:01,650 INFO org.kitesdk.morphline.api.MorphlineContext: Importing commands
2014-09-10 16:05:01,688 INFO org.kitesdk.morphline.api.MorphlineContext: Importing commands
2014-09-10 16:05:01,726 INFO org.kitesdk.morphline.api.MorphlineContext: Importing commands
2014-09-10 16:05:01,732 INFO org.kitesdk.morphline.api.MorphlineContext: Importing commands
2014-09-10 16:05:01,740 INFO org.kitesdk.morphline.api.MorphlineContext: Importing commands
2014-09-10 16:05:01,752 INFO org.kitesdk.morphline.api.MorphlineContext: Importing commands
Cloudera经理不会提供任何有用的信息,除了说过程已经退出。一些记录确实在Solr索引中得到更新,表明索引器至少运行了一段时间。
我在RHEL6.5和JDK7上运行最新的CDH 5.1。
答案 0 :(得分:1)
如果您正在使用WAL,即在您的hbase客户端程序中写入头日志选项为FALSE,如mapreduce(出于性能原因)。它不会写入HLOG。
NRT适用于WAL / HLOG。
批处理索引工作正常,因为它不适用于WAL