Hadoop distcp error - java.lang.IllegalArgumentException: 'key@1' not found

Asked: 2019-06-13 03:52:57

Tags: hadoop encryption distcp tde apache-ranger

I am trying to migrate data from a source Hadoop cluster to a destination Hadoop cluster using distcp. The data I have on the source sits in an encryption zone (/data/sit). The idea is to move the data from the encryption zone on the source cluster into the encryption zone on the destination cluster.

I have created the same path (/data/sit) on the destination cluster and made it an encryption zone. The encryption keys are managed by Ranger KMS on both the source and destination clusters. The Ranger KMS key used to encrypt the source path (/data/sit) is named sit_key. I have also created a key with the same name on the destination (although distcp should not actually need it).
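For reference, the destination encryption zone was set up roughly along these lines (a minimal sketch using the standard hadoop key / hdfs crypto CLIs; the kms:// provider URI below is a placeholder for the actual Ranger KMS endpoint):

# create the key in the destination Ranger KMS (placeholder provider URI)
hadoop key create sit_key -provider kms://http@rangerkms.dest.example.com:9292/kms
# make /data/sit an encryption zone backed by that key
hdfs crypto -createZone -keyName sit_key -path /data/sit
# confirm the zone and the key backing it
hdfs crypto -listZones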

Here is the command I am using for distcp:

hadoop distcp -Dmapreduce.job.queuename=${yarn_queue} -Dmapreduce.job.hdfs-servers.token-renewal.exclude=${dest} -skipcrccheck -prbugpc -update -delete "hdfs://${src}/data/sit" "hdfs://${dest}/data/sit"
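As an aside, the Hadoop transparent-encryption documentation also describes copying the raw encrypted bytes between encryption zones through the /.reserved/raw virtual path, which skips KMS decryption during the copy. This is only a sketch, and it only helps when both clusters' KMS instances hold identical key material (the same sit_key versions on both sides), which is not the case when the keys were created independently:

# copies the ciphertext plus the raw.* xattrs that carry the per-file encrypted data keys
hadoop distcp -update "hdfs://${src}/.reserved/raw/data/sit" "hdfs://${dest}/.reserved/raw/data/sit"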

When I run the above command, several files fail with the following error:

19/06/13 03:16:47 INFO mapreduce.Job: Task Id : attempt_1560292609758_0035_m_000011_2, Status : FAILED
Error: java.io.IOException: File copy failed: hdfs://source/data/sit/hdp/datain/YYYY=2019/MM=06/DD=11/1720/ff461c1a-e7b4-4fe7-a421-f9be0bf5b6f2.metadata --> hdfs://target/data/sit/hdp/datain/YYYY=2019/MM=06/DD=11/1720/ff461c1a-e7b4-4fe7-a421-f9be0bf5b6f2.metadata
        at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:299)
        at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:266)
        at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:52)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
Caused by: java.io.IOException: Couldn't run retriable-command: Copying hdfs://source/data/sit/hdp/datain/YYYY=2019/MM=06/DD=11/1720/ff461c1a-e7b4-4fe7-a421-f9be0bf5b6f2.metadata to hdfs://target/data/sit/hdp/datain/YYYY=2019/MM=06/DD=11/1720/ff461c1a-e7b4-4fe7-a421-f9be0bf5b6f2.metadata
        at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:101)
        at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:296)
        ... 10 more
Caused by: java.lang.IllegalArgumentException: 'sit_key@1' not found
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.util.HttpExceptionUtils.validateResponse(HttpExceptionUtils.java:157)
        at org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:608)
        at org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:566)
        at org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:834)
        at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:210)
        at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:206)
        at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:95)
        at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:206)
        at org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:388)
        at org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:1393)
        at org.apache.hadoop.hdfs.DFSClient.createWrappedInputStream(DFSClient.java:1463)
        at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:333)
        at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:327)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:340)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:786)
        at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.getInputStream(RetriableFileCopyCommand.java:300)
        at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyBytes(RetriableFileCopyCommand.java:249)
        at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyToFile(RetriableFileCopyCommand.java:183)
        at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:123)
        at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:99)
        at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
        ... 11 more

I am not sure what is causing this. The interesting part is Caused by: java.lang.IllegalArgumentException: 'sit_key@1' not found. sit_key is the name of the key used for encryption on both the source and destination paths, but I am not sure what the @1 at the end means.

The sit_key on the source has also been rolled over to version 2. I am not sure whether that is contributing to the problem.
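For what it is worth, KMS/KeyProvider version names have the form keyname@N, so 'sit_key@1' presumably refers to a specific version of sit_key that the KMS being asked to decrypt does not hold. A few commands that can be used to compare what each side actually has (a sketch; the kms:// provider URIs are placeholders for the real Ranger KMS endpoints):

# which key provider the HDFS client on each cluster resolves
# (hadoop.security.key.provider.path on newer Hadoop releases)
hdfs getconf -confKey dfs.encryption.key.provider.uri
# key metadata, including the number of versions, on each Ranger KMS
hadoop key list -metadata -provider kms://http@rangerkms.src.example.com:9292/kms
hadoop key list -metadata -provider kms://http@rangerkms.dest.example.com:9292/kms
# encryption zones and the key backing each one
hdfs crypto -listZones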

distcp eventually fails because of these errors on several files; out of roughly 500 GB of data, only ~128 GB gets copied.

Any help would be appreciated. Thanks.

0 Answers:

No answers.