I'd like to know whether I can use the output of an HBase snapshot for a bulk load. I'm trying to load data into another cluster, and org.apache.hadoop.hbase.snapshot.ExportSnapshot isn't working out for me because we have more than 1 TB of data to transfer.
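For reference, this is the kind of ExportSnapshot run I mean (a sketch only; the destination namenode and mapper count here are placeholders, not what I actually used):

hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot metadata_20150424 -copy-to hdfs://dest-nn/hbase -mappers 16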
So I've been looking at snapshots, and it looks like creating and exporting a snapshot produces HFiles?
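A snapshot like the one below is normally taken with the standard hbase shell snapshot command, e.g.:

hbase shell
snapshot 'metadata', 'metadata_20150424'

SnapshotInfo then lists the files the snapshot references: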
hbase org.apache.hadoop.hbase.snapshot.SnapshotInfo -snapshot metadata_20150424 -files
2015-05-01 19:41:48,313 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
Snapshot Info
----------------------------------------
Name: metadata_20150424
Type: FLUSH
Table: metadata
Format: 0
Created: 2015-04-24T16:38:50
Snapshot Files
----------------------------------------
8.0 K metadata/0cac6c0df8910c33fd16440be8188612/r/0ead631947ee459e9bc90ffead3f82f3 (archive)
1.5 K metadata/0cac6c0df8910c33fd16440be8188612/r/2c38dd290ae54d98b858f9cc0c00f8a0 (archive)
1.3 K metadata/0cac6c0df8910c33fd16440be8188612/r/6e236ca738974d3b8002b3c34152cd9d (archive)
10.8 K metadata/18719f6ffa69864249392d2412f26538/r/74bfc686e53e42fcae6c2b11508dff77 (archive)
1.2 K metadata/2616b9514013465fa281a12100968ad3/r/1cdcc2ed27d347a8996645454b15cc8a (archive)
<etc>
2015-05-01 19:41:48,718 INFO [main] util.FSVisitor: No logs under directory:hdfs://xxx/apps/hbase/data/.hbase-snapshot/metadata_20150424/WALs
26 HFiles (26 in archive), total size 113.4 K (0.00% 0 shared with the source table)
0 Logs, total size 0
I was able to export the files locally and then bring them over to my other cluster.
After exporting the snapshot to my local HDFS, the files look like this:
hdfs dfs -ls /dmas/export/.hbase-snapshot/metadata_20150424
Found 13 items
-rw-r--r-- 3 hbase hdfs 64 2015-05-01 19:25 /export/.hbase-snapshot/metadata_20150424/.snapshotinfo
drwxr-xr-x - hbase hdfs 0 2015-05-01 19:25 /export/.hbase-snapshot/metadata_20150424/.tabledesc
drwxr-xr-x - hbase hdfs 0 2015-05-01 19:25 /export/.hbase-snapshot/metadata_20150424/.tmp
drwxr-xr-x - hbase hdfs 0 2015-05-01 19:25 /export/.hbase-snapshot/metadata_20150424/0cac6c0df8910c33fd16440be8188612
drwxr-xr-x - hbase hdfs 0 2015-05-01 19:25 /export/.hbase-snapshot/metadata_20150424/18719f6ffa69864249392d2412f26538
drwxr-xr-x - hbase hdfs 0 2015-05-01 19:25 /export/.hbase-snapshot/metadata_20150424/2616b9514013465fa281a12100968ad3
drwxr-xr-x - hbase hdfs 0 2015-05-01 19:25 /export/.hbase-snapshot/metadata_20150424/9dc4e7ce25124514642c5222e1f3299f
drwxr-xr-x - hbase hdfs 0 2015-05-01 19:25 /export/.hbase-snapshot/metadata_20150424/9f5e696c72f81eeb634a9c24045ba797
drwxr-xr-x - hbase hdfs 0 2015-05-01 19:25 /export/.hbase-snapshot/metadata_20150424/a346fb7ccb0cd2c77633e2a1322573ef
drwxr-xr-x - hbase hdfs 0 2015-05-01 19:25 /export/.hbase-snapshot/metadata_20150424/b634a8daa1071cff4b275ea93067ea12
drwxr-xr-x - hbase hdfs 0 2015-05-01 19:25 /export/.hbase-snapshot/metadata_20150424/ca1f884b9420b778786304bd91374607
drwxr-xr-x - hbase hdfs 0 2015-05-01 19:25 /export/.hbase-snapshot/metadata_20150424/fb69a83e5e7944ed6bd075529f40765d
drwxr-xr-x - hbase hdfs 0 2015-05-01 19:25 /export/.hbase-snapshot/metadata_20150424/fdd7e3789531046487009c5c73b685ce
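Getting these files over to the other cluster is just a plain file-level copy between the two HDFS instances; a distcp along these lines would do it (the namenode names here are placeholders for the real addresses):

hadoop distcp hdfs://src-nn/export/.hbase-snapshot/metadata_20150424 hdfs://dest-nn/import/metadata_snapshot/metadata_20150424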
On the cluster that I exported the files to, they look like this:
-rw-r--r-- 3 hbase hdfs 64 2015-05-01 19:59 /import/metadata_snapshot/metadata_20150424/.snapshotinfo
drwxr-xr-x - hbase hdfs 0 2015-05-01 19:59 /import/metadata_snapshot/metadata_20150424/.tabledesc
drwxr-xr-x - hbase hdfs 0 2015-05-01 19:59 /import/metadata_snapshot/metadata_20150424/.tmp
drwxr-xr-x - hbase hdfs 0 2015-05-01 19:59 /import/metadata_snapshot/metadata_20150424/0cac6c0df8910c33fd16440be8188612
drwxr-xr-x - hbase hdfs 0 2015-05-01 19:59 /import/metadata_snapshot/metadata_20150424/18719f6ffa69864249392d2412f26538
Then I figured I could run the LoadIncrementalHFiles utility:
bin/hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles /import/metadata_snapshot/metadata_20150424/ metadata
However, I get the following error:
2015-05-01 20:06:23,447 ERROR [main] mapreduce.LoadIncrementalHFiles: -------------------------------------------------
Bulk load aborted with some files not yet loaded:
-------------------------------------------------
hdfs://hadmas/import/dmas_signer_metadata_snapshot/dmas_signer_metadata_20150424/.tabledesc/.tableinfo.0000000001
hdfs://xxx/import/metadata_snapshot/metadata_20150424/0cac6c0df8910c33fd16440be8188612/.regioninfo
hdfs://xxx/import/metadata_snapshot/metadata_20150424/0cac6c0df8910c33fd16440be8188612/r
hdfs://xxx/import/metadata_snapshot/metadata_20150424/18719f6ffa69864249392d2412f26538/.regioninfo
hdfs://xxx/import/metadata_snapshot/metadata_20150424/18719f6ffa69864249392d2412f26538/r
hdfs://xxx/import/metadata_snapshot/metadata_20150424/2616b9514013465fa281a12100968ad3/.regioninfo
hdfs://xxx/import/metadata_snapshot/metadata_20150424/2616b9514013465fa281a12100968ad3/r
hdfs://xxx/import/metadata_snapshot/metadata_20150424/9dc4e7ce25124514642c5222e1f3299f/.regioninfo
hdfs://xxx/import/metadata_snapshot/metadata_20150424/9dc4e7ce25124514642c5222e1f3299f/r
hdfs://xxx/import/metadata_snapshot/metadata_20150424/9f5e696c72f81eeb634a9c24045ba797/.regioninfo
hdfs://xxx/import/metadata_snapshot/metadata_20150424/9f5e696c72f81eeb634a9c24045ba797/r
hdfs://xxx/import/metadata_snapshot/metadata_20150424/a346fb7ccb0cd2c77633e2a1322573ef/.regioninfo
hdfs://xxx/import/metadata_snapshot/metadata_20150424/a346fb7ccb0cd2c77633e2a1322573ef/r
hdfs://xxx/import/metadata_snapshot/metadata_20150424/b634a8daa1071cff4b275ea93067ea12/.regioninfo
hdfs://xxx/import/metadata_snapshot/metadata_20150424/b634a8daa1071cff4b275ea93067ea12/r
hdfs://xxx/import/metadata_snapshot/metadata_20150424/ca1f884b9420b778786304bd91374607/.regioninfo
hdfs://xxx/import/metadata_snapshot/metadata_20150424/ca1f884b9420b778786304bd91374607/r
hdfs://xxx/import/metadata_snapshot/metadata_20150424/fb69a83e5e7944ed6bd075529f40765d/.regioninfo
hdfs://xxx/import/metadata_snapshot/metadata_20150424/fb69a83e5e7944ed6bd075529f40765d/r
hdfs://xxx/import/metadata_snapshot/metadata_20150424/fdd7e3789531046487009c5c73b685ce/.regioninfo
hdfs://xxx/import/metadata_snapshot/metadata_20150424/fdd7e3789531046487009c5c73b685ce/r
Exception in thread "main" java.io.IOException: Unmatched family names found: unmatched family names in HFiles to be bulkloaded: [.tabledesc, 0cac6c0df8910c33fd16440be8188612, 0cac6c0df8910c33fd16440be8188612, 18719f6ffa69864249392d2412f26538, 18719f6ffa69864249392d2412f26538, 2616b9514013465fa281a12100968ad3, 2616b9514013465fa281a12100968ad3, 9dc4e7ce25124514642c5222e1f3299f, 9dc4e7ce25124514642c5222e1f3299f, 9f5e696c72f81eeb634a9c24045ba797, 9f5e696c72f81eeb634a9c24045ba797, a346fb7ccb0cd2c77633e2a1322573ef, a346fb7ccb0cd2c77633e2a1322573ef, b634a8daa1071cff4b275ea93067ea12, b634a8daa1071cff4b275ea93067ea12, ca1f884b9420b778786304bd91374607, ca1f884b9420b778786304bd91374607, fb69a83e5e7944ed6bd075529f40765d, fb69a83e5e7944ed6bd075529f40765d, fdd7e3789531046487009c5c73b685ce, fdd7e3789531046487009c5c73b685ce]; valid family names of table metadata are: [r]
at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:243)
at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.run(LoadIncrementalHFiles.java:825)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.main(LoadIncrementalHFiles.java:831)
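My understanding of the error is that LoadIncrementalHFiles expects its input directory to contain one subdirectory per column family (hence "valid family names of table metadata are: [r]"), while the exported snapshot keeps a region/family/hfile layout. Roughly (the /bulkload/input path is hypothetical; the region and file names are the ones from the listings above):

Layout LoadIncrementalHFiles wants:
/bulkload/input/r/0ead631947ee459e9bc90ffead3f82f3
/bulkload/input/r/74bfc686e53e42fcae6c2b11508dff77

Layout the exported snapshot has:
/import/metadata_snapshot/metadata_20150424/0cac6c0df8910c33fd16440be8188612/r/...
/import/metadata_snapshot/metadata_20150424/18719f6ffa69864249392d2412f26538/r/...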
I'm not sure whether what I'm trying to do is even possible, but I figured I'd give it a try since the Import/Export utilities don't work for me. Does anyone know whether this is possible, and if so, what I'm doing wrong?
Thanks in advance!