Question

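I need to copy Hive tables to another cluster while keeping the table schemas and the directory hierarchy. What is the safest and most correct way to get an exact copy of the tables (and databases) from Cluster1 onto Cluster2?

The approach I keep finding is: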

 - hive> EXPORT TABLE table1 TO '<export_path>';
 - hadoop distcp hdfs:source_Path hdfs:dest_Path
 - hive> IMPORT TABLE table1 FROM '<dest_path>'; # on Cluster 2
 - hive> MSCK REPAIR TABLE table1;

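However, I have a large number of databases and tables to copy. Is there a fast and safe way to do this in bulk, such as copying the state or a snapshot of Datawarehouse1 to Datawarehouse2?

Thanks in advance.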

Answer 1

Schema migration (assuming the Hive Metastore is stored in MySQL)

Dump the Metastore database:

mysqldump -u **** -p***** metastoredb > metastore.sql

Replace the Cluster1 filesystem URI with the Cluster2 filesystem URI in the dump:

sed -i 's_hdfs://namenode1:port1_hdfs://namenode2:port2_g' metastore.sql
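
It can be worth checking that no reference to the old NameNode survives the substitution before restoring the dump. The following is a minimal sketch, not part of the original answer; the function name `rewrite_fs_uri` and its arguments are hypothetical, and it assumes the URIs contain no `|` character.

```shell
# Rewrite the filesystem URI in a metastore dump and report leftovers.
# rewrite_fs_uri is a hypothetical helper, not a Hive or Hadoop tool.
# Note: sed treats old_uri as a regular expression, so a "." in a host
# name matches any character; that is usually harmless for this purpose.
rewrite_fs_uri() {
    dump_file="$1"
    old_uri="$2"
    new_uri="$3"
    # Edit in place, keeping the untouched dump as <dump_file>.bak.
    sed -i.bak "s|${old_uri}|${new_uri}|g" "$dump_file"
    # Print the number of lines that still mention the old URI.
    # grep exits non-zero when the count is 0, hence the "|| true".
    grep -c -F "$old_uri" "$dump_file" || true
}
```

Called as `rewrite_fs_uri metastore.sql hdfs://namenode1:port1 hdfs://namenode2:port2`, it prints the number of lines still pointing at the old cluster, which should be 0 before the dump is restored.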

Move the dump to the target cluster and restore it:

mysql> create database metastoredb;
mysql> use metastoredb;
mysql> source metastore.sql;

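If the target cluster runs a different Hive version, run the matching metastore upgrade scripts (Hive ships them under scripts/metastore/upgrade, and schematool can apply them with -upgradeSchema).

The warehouse directory and the external table locations must be migrated with distcp so that the directory structure is preserved: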

hadoop distcp hdfs://namenode1:port1/hive/data hdfs://namenode2:port2/hive/data
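
When external tables live outside the warehouse directory, each location needs its own copy. The sketch below only prints the commands so they can be reviewed before running; `print_distcp_cmds` is a hypothetical helper, and the choice of `-update` (copy the contents of the source directory into the target and skip files that already match) and `-pugp` (preserve user, group, and permissions) is an assumption about what a migration usually wants, not something the original answer specifies.

```shell
# Print (do not run) one distcp command per HDFS path to migrate.
# print_distcp_cmds is a hypothetical helper for reviewing the plan.
# Arguments: source NameNode URI, target NameNode URI, then paths.
print_distcp_cmds() {
    src_nn="$1"
    dst_nn="$2"
    shift 2
    for path in "$@"; do
        # The same path is used on both sides so table locations
        # recorded in the metastore stay valid after the URI rewrite.
        echo "hadoop distcp -update -pugp ${src_nn}${path} ${dst_nn}${path}"
    done
}
```

For example, `print_distcp_cmds hdfs://namenode1:port1 hdfs://namenode2:port2 /hive/data /data/external` prints two commands, one per directory.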

Hive's EXPORT and IMPORT commands have no database-level option; they operate on one table (or partition) at a time.
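
If the EXPORT/IMPORT route is still preferred over a metastore dump, the per-table statements can be generated rather than typed. This is only a sketch under stated assumptions: `gen_export_hql` is a hypothetical helper, table names are read one per line from standard input (for instance from the output of `SHOW TABLES`), and the export root path is a placeholder.

```shell
# Emit one EXPORT statement per table of a database.
# gen_export_hql is a hypothetical helper; it only prints HiveQL.
# Arguments: database name, HDFS root directory for the exports.
# Standard input: table names, one per line.
gen_export_hql() {
    db="$1"
    export_root="$2"
    echo "USE ${db};"
    while IFS= read -r tbl; do
        # Skip blank lines in the table list.
        [ -n "$tbl" ] || continue
        echo "EXPORT TABLE ${tbl} TO '${export_root}/${db}/${tbl}';"
    done
}
```

The generated statements can be saved to a file and run with `hive -f`; the matching IMPORT statements on the target cluster would follow the same pattern.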