Writing to HBase from multiple processes

Date: 2017-08-07 05:29:20

Tags: multithreading hadoop thread-safety hbase hdfs

I have to load about 2.5 TB of data into an HBase database. Because a single process takes a very long time to write all of it, I tried creating multiple processes to do the writes. But I doubt whether this is safe, because I am calling self from multiple processes at the same time. I am using HBase version 1.1.2.2.6.1.0-129 on my system.

I perform all writes through the HBase REST API. Because of the many concurrent operations, I get the following error on the server:

2017-08-07 11:01:18,311 INFO [hconnection-0x5e834655-shared--pool3-t52497] client.AsyncProcess: #7542, table=clueweb12, attempt=12/35 failed=124ops, last exception: org.apache.hadoop.hbase.RegionTooBusyException: org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit, regionName=clueweb12,http://www.innovativeusers.org/list/archives/2006/msg058,1502034291085.f923654226ace7491d61f8b67764c46f., server=node9.local,16020,1499079576247, memstoreSize=539000216, blockingMemStoreSize=536870912
    at org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:3824)
    at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2977)
    at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2928)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:748)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:708)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2124)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32393)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2141)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
 on node9.local,16020,1499079576247, tracking started null, retrying after=20140ms, replay=124ops
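To make the numbers in the RegionTooBusyException message easier to read, here is a minimal Python sketch that extracts memstoreSize and blockingMemStoreSize from a shortened copy of the log line above and reports by how much the limit was exceeded. The helper name memstore_overshoot and the shortened excerpt are my own, for illustration only. The limit works out to exactly 512 MiB, which would be consistent with a 128 MiB flush size and a block multiplier of 4, but I have not confirmed what my cluster is actually configured with.

```python
import re

# Shortened excerpt of the server message above; the two values are copied verbatim.
LOG_EXCERPT = (
    "RegionTooBusyException: Above memstore limit, "
    "memstoreSize=539000216, blockingMemStoreSize=536870912"
)


def memstore_overshoot(message: str) -> int:
    """Return how many bytes memstoreSize is above blockingMemStoreSize.

    Hypothetical helper for reading the log line; not part of any HBase API.
    """
    # The match is case sensitive, so "memstoreSize=" does not match
    # inside "blockingMemStoreSize=".
    size = int(re.search(r"memstoreSize=(\d+)", message).group(1))
    limit = int(re.search(r"blockingMemStoreSize=(\d+)", message).group(1))
    return size - limit


if __name__ == "__main__":
    limit = int(re.search(r"blockingMemStoreSize=(\d+)", LOG_EXCERPT).group(1))
    print("limit in MiB:", limit / 2**20)
    print("bytes over the limit:", memstore_overshoot(LOG_EXCERPT))
```

So the region's memstore was only about 2 MB over the blocking limit when the batch of 124 operations was rejected, and the same line shows the client scheduling a retry (attempt=12/35, replay=124ops) rather than giving up.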

Can this error cause data loss?

0 answers:

No answers yet