最近,当我使用Spark和HBase测试我的集群时。我使用newAPIHadoopRDD从HBase表中读取记录。我发现newAPIHadoopRDD太慢了,时间与Region Servers的数量成正比。
下面的spark调试(为测试打开)日志显示了以下过程:
17/03/02 22:00:30 DEBUG AbstractRpcClient: Use SIMPLE authentication for service ClientService, sasl=false
17/03/02 22:00:30 DEBUG AbstractRpcClient: Connecting to slave111/192.168.10.111:16020
17/03/02 22:00:30 DEBUG ClientCnxn: Reading reply sessionid:0x15a8de8a86f0444, packet:: clientPath:null serverPath:null finished:false header:: 5,3 replyHeader:: 5,116079898,0 request:: '/hbase,F response:: s{116070329,116070329,1488462020202,1488462020202,0,16,0,0,0,16,116070652}
17/03/02 22:00:30 DEBUG ClientCnxn: Reading reply sessionid:0x15a8de8a86f0444, packet:: clientPath:null serverPath:null finished:false header:: 6,4 replyHeader:: 6,116079898,0 request:: '/hbase/master,F response:: #ffffffff000146d61737465723a3136303030fffffff4ffffffa23affffffc8ffffffb6ffffffb1ffffffc21a50425546a12a66d617374657210ffffff807d18ffffffcffffffff4fffffffffffffff9ffffffa82b10018ffffff8a7d,s{116070348,116070348,1488462021202,1488462021202,0,0,0,97546372339663909,54,0,116070348}
17/03/02 22:00:30 DEBUG AbstractRpcClient: Use SIMPLE authentication for service MasterService, sasl=false
17/03/02 22:00:30 DEBUG AbstractRpcClient: Connecting to master/192.168.10.100:16000
17/03/02 22:00:30 DEBUG RegionSizeCalculator: Region tt,3,1488442069431.21d34666d310df3f180b2dba093d910d. has size 0
17/03/02 22:00:30 DEBUG RegionSizeCalculator: Region tt,,1488442069431.cb8696957957f824f1a16210768bf197. has size 0
17/03/02 22:00:30 DEBUG RegionSizeCalculator: Region tt,1,1488442069431.274ddaa4abb34f0408cac0f33107529c. has size 0
17/03/02 22:00:30 DEBUG RegionSizeCalculator: Region tt,2,1488442069431.05dd84aacb7f2587e325c8baf4c27613. has size 0
17/03/02 22:00:30 DEBUG RegionSizeCalculator: Region sizes calculated
17/03/02 22:00:38 DEBUG Client: IPC Client (480943798) connection to master/192.168.10.100:9000 from hadoop: closed
17/03/02 22:00:38 DEBUG Client: IPC Client (480943798) connection to master/192.168.10.100:9000 from hadoop: stopped, remaining connections 0
17/03/02 22:00:43 DEBUG ClientCnxn: Got ping response for sessionid: 0x15a8de8a86f0444 after 0ms
17/03/02 22:00:56 DEBUG ClientCnxn: Got ping response for sessionid: 0x15a8de8a86f0444 after 0ms
17/03/02 22:01:00 DEBUG TableInputFormatBase: getSplits: split -> 0 -> HBase table split(table name: tt, scan: , start row: , end row: 1, region location: slave104)
17/03/02 22:01:10 DEBUG ClientCnxn: Got ping response for sessionid: 0x15a8de8a86f0444 after 0ms
17/03/02 22:01:23 DEBUG ClientCnxn: Got ping response for sessionid: 0x15a8de8a86f0444 after 0ms
17/03/02 22:01:30 DEBUG TableInputFormatBase: getSplits: split -> 1 -> HBase table split(table name: tt, scan: , start row: 1, end row: 2, region location: slave102)
17/03/02 22:01:37 DEBUG ClientCnxn: Got ping response for sessionid: 0x15a8de8a86f0444 after 0ms
17/03/02 22:01:50 DEBUG ClientCnxn: Got ping response for sessionid: 0x15a8de8a86f0444 after 0ms
17/03/02 22:02:00 DEBUG TableInputFormatBase: getSplits: split -> 2 -> HBase table split(table name: tt, scan: , start row: 2, end row: 3, region location: slave112)
17/03/02 22:02:03 DEBUG ClientCnxn: Got ping response for sessionid: 0x15a8de8a86f0444 after 0ms
17/03/02 22:02:17 DEBUG ClientCnxn: Got ping response for sessionid: 0x15a8de8a86f0444 after 0ms
17/03/02 22:02:30 DEBUG ClientCnxn: Got ping response for sessionid: 0x15a8de8a86f0444 after 0ms
17/03/02 22:02:30 DEBUG TableInputFormatBase: getSplits: split -> 3 -> HBase table split(table name: tt, scan: , start row: 3, end row: , region location: slave108)
17/03/02 22:02:30 INFO ConnectionManager$HConnectionImplementation: Closing master protocol: MasterService
17/03/02 22:02:30 INFO ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x15a8de8a86f0444
17/03/02 22:02:30 DEBUG ZooKeeper: Closing session: 0x15a8de8a86f0444
17/03/02 22:02:30 DEBUG ClientCnxn: Closing client for session: 0x15a8de8a86f0444
17/03/02 22:02:30 DEBUG ClientCnxn: Reading reply sessionid:0x15a8de8a86f0444, packet:: clientPath:null serverPath:null finished:false header:: 7,-11 replyHeader:: 7,116080795,0 request:: null response:: null
17/03/02 22:02:30 DEBUG ClientCnxn: Disconnecting client for session: 0x15a8de8a86f0444
17/03/02 22:02:30 INFO ZooKeeper: Session: 0x15a8de8a86f0444 closed
17/03/02 22:02:30 INFO ClientCnxn: EventThread shut down
17/03/02 22:02:30 DEBUG AbstractRpcClient: Stopping rpc client
17/03/02 22:02:30 DEBUG ClientCnxn: An exception was thrown while closing send thread for session 0x15a8de8a86f0444 : Unable to read additional data from server sessionid 0x15a8de8a86f0444, likely server has closed socket
17/03/02 22:02:30 DEBUG ClosureCleaner: +++ Cleaning closure <function1> (org.apache.spark.rdd.RDD$$anonfun$count$1) +++
我使用的是Spark 2.1.0,HBase 1.1.2。 getSplits操作花了太多时间。区域服务器编号从1到4进行了测试,每个区域服务器需要30秒。 HBase表不包含任何记录(仅用于测试)。
这是正常的吗?有没有人和我一样遭受同样的问题?
测试代码如下所示:
Configuration hconf = HBaseConfiguration.create();
hconf.set(TableInputFormat.INPUT_TABLE, GLOBAL.TABLE_NAME);
hconf.set("hbase.zookeeper.quorum", "192.168.10.100");
hconf.set("hbase.zookeeper.property.clientPort", "2181");
Scan scan = new Scan();
JavaPairRDD<ImmutableBytesWritable, Result> results
= sc.newAPIHadoopRDD(hconf, TableInputFormat.class, ImmutableBytesWritable.class, Result.class);
long cnt = results.count();
System.out.println(cnt);
修改
用HBase源代码调试后,我找到了速度慢的原因。来自 TableInputFormatBase.java 的反向DNS 操作是罪魁祸首。
ipAddressString = DNS.reverseDns(ipAddress, null);
现在如何解决这个问题?我可以在HBase配置中添加一些dns-ip对吗?
答案 0 :(得分:1)
使用nslookup反向查找192.168.10.100时,我得到了以下结果。
;; connection timed out; trying next origin
;; connection timed out; no servers could be reached
所以,我执行了下面的cmds,
sudo iptables -t nat -A POSTROUTING -s 192.168.10.0/24 -o em4 -j MASQUERADE
sudo sysctl -w net.ipv4.ip_forward=1
sudo route add default gw 'mygatway' em4
然后,问题就消失了。