I am implementing a Scala function that parses JSON files and loads the data into HBase. It works fine on my local machine, where the ZooKeeper server is localhost.
Now, for scaling, I am trying to run it on a Hadoop cluster that has three specific ZooKeeper nodes: node00, node01, and solr.
I would like to know how to point my code at that specific cluster. It seems I need to connect to those nodes explicitly, because I always get the following error:
17/12/27 18:20:17 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting to reconnect
java.net.ConnectException: Connection refused
Please advise.
EDIT1:
Thank you very much for the replies. I could not check with the method you suggested because I do not have sufficient permissions. Below is the code that takes the input JSON and parses it into HBase:
val input_base = ""
val files = List("ffff")
for (collection_name <- files) {
  val input_filename = input_base + collection_name + ".json"
  val col_name = collection_name
  val collection_Id = "1007"
  @transient val hadoopConf = new Configuration()
  @transient val conf = HBaseConfiguration.create(hadoopConf)
  conf.set(TableOutputFormat.OUTPUT_TABLE, tableName)
  @transient val jobConfig: JobConf = new JobConf(conf, this.getClass)
  jobConfig.setOutputFormat(classOf[TableOutputFormat])
  jobConfig.set(TableOutputFormat.OUTPUT_TABLE, tableName)
  sc.textFile(input_filename).map(l => parse_json(l, col_name, collection_Id)).saveAsHadoopDataset(jobConfig)
}
I need help passing these nodes as parameters to my code. The ZooKeeper quorum has solr2.dlrl, node02.dlrl, and node00.dlrl as its nodes, and 2181 as the port. Please help.
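Based on the standard HBase client properties `hbase.zookeeper.quorum` and `hbase.zookeeper.property.clientPort`, I assume I need something like the following sketch, with the `conf.set` calls placed right after `HBaseConfiguration.create` in my code above (the placement is my guess, not something I have verified on the cluster):

```scala
// Hypothetical sketch: build the ZooKeeper connect string for the quorum
// shown in the rowcounter log (solr2.dlrl, node02.dlrl, node00.dlrl, port 2181).
val zkNodes = Seq("solr2.dlrl", "node02.dlrl", "node00.dlrl")
val zkQuorum = zkNodes.mkString(",")

// In my code above, I assume these would go right after HBaseConfiguration.create:
//   conf.set("hbase.zookeeper.quorum", zkQuorum)
//   conf.set("hbase.zookeeper.property.clientPort", "2181")

println(zkQuorum)  // solr2.dlrl,node02.dlrl,node00.dlrl
```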
If I run the rowcounter job, it connects to the cluster correctly:
hbase org.apache.hadoop.hbase.mapreduce.RowCounter ram
as shown below:
18/01/02 14:52:41 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
18/01/02 14:52:41 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
18/01/02 14:52:41 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
18/01/02 14:52:41 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
18/01/02 14:52:41 INFO zookeeper.ZooKeeper: Client environment:os.version=2.6.32-696.6.3.el6.x86_64
18/01/02 14:52:41 INFO zookeeper.ZooKeeper: Client environment:user.name=cs5604f17_cta
18/01/02 14:52:41 INFO zookeeper.ZooKeeper: Client environment:user.home=/home/cs5604f17_cta
18/01/02 14:52:41 INFO zookeeper.ZooKeeper: Client environment:user.dir=/home/cs5604f17_cta/ram/cmt/CMT/cs5604f17_cmt
18/01/02 14:52:41 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=solr2.dlrl:2181,node02.dlrl:2181,node00.dlrl:2181 sessionTimeout=60000 watcher=hconnection-0x524af2a60x0, quorum=solr2.dlrl:2181,node02.dlrl:2181,node00.dlrl:2181, baseZNode=/hbase
18/01/02 14:52:41 INFO zookeeper.ClientCnxn: Opening socket connection to server solr2.dlrl/10.0.0.125:2181. Will not attempt to authenticate using SASL (unknown error)
18/01/02 14:52:41 INFO zookeeper.ClientCnxn: Socket connection established, initiating session, client: /10.0.0.127:54862, server: solr2.dlrl/10.0.0.125:2181
18/01/02 14:52:41 INFO zookeeper.ClientCnxn: Session establishment complete on server solr2.dlrl/10.0.0.125:2181, sessionid = 0x160aa7b7c311ea2, negotiated timeout = 60000
18/01/02 14:52:41 INFO util.RegionSizeCalculator: Calculating region sizes for table "ram-irma".
18/01/02 14:52:42 INFO client.ConnectionManager$HConnectionImplementation: Closing master protocol: MasterService
18/01/02 14:52:42 INFO client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x160aa7b7c311ea2
18/01/02 14:52:42 INFO zookeeper.ZooKeeper: Session: 0x160aa7b7c311ea2 closed
18/01/02 14:52:42 INFO zookeeper.ClientCnxn: EventThread shut down
18/01/02 14:52:42 INFO mapreduce.JobSubmitter: number of splits:12
18/01/02 14:52:42 INFO Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
18/01/02 14:52:43 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1514688768112_0007
18/01/02 14:52:43 INFO impl.YarnClientImpl: Submitted application application_1514688768112_0007
18/01/02 14:52:43 INFO mapreduce.Job: The url to track the job: http://node02.dlrl:8088/proxy/application_1514688768112_0007/
18/01/02 14:52:43 INFO mapreduce.Job: Running job: job_1514688768112_0007
18/01/02 14:52:48 INFO mapreduce.Job: Job job_1514688768112_0007 running in uber mode : false