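I am trying to use the DistCp class to emulate the command hadoop distcp -update -delete. I instantiate DistCp with a DistCpOptions object that I configured beforehand, but DistCp does not seem to pick up the options I pass in. Looking at the trace, I can confirm that the flags I set are not activated.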
package com.keedio.hadoop;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;
import org.apache.hadoop.tools.DistCp;
import org.apache.hadoop.tools.DistCpOptionSwitch;
import org.apache.hadoop.tools.DistCpOptions;
import org.apache.hadoop.util.ToolRunner;

import java.net.URI;
import java.util.Collections;

public class Mover {

    public static void main(String[] args) {
        try {
            // Source and target paths handed to ToolRunner as command-line arguments.
            String[] arguments = {"file:////home/cloudera/Desktop/files", "hdfs:////user/cloudera/test"};

            // Options set programmatically: the equivalent of -update and -delete.
            DistCpOptions distCpOptions = new DistCpOptions(new Path("file:////home/cloudera/Desktop/files"), new Path("file:////user/cloudera/test"));
            distCpOptions.setSyncFolder(true);
            distCpOptions.setDeleteMissing(true);

            // Load the cluster configuration and register the file system implementations.
            Configuration conf = new Configuration();
            conf.addResource(new Path("/etc/hadoop/conf/yarn-site.xml"));
            conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
            conf.addResource(new Path("/etc/hadoop/conf/mapred-site.xml"));
            conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
            conf.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
            conf.set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem.class.getName());

            DistCp distCp = new DistCp(conf, distCpOptions);
            ToolRunner.run(distCp, arguments);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
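When I run this code, the options printed on the command line differ from the ones I set. For example, syncFolder is shown as false even though I set it to true in DistCpOptions.
Input options: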
DistCpOptions{
atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false,
overwrite=false, append=false, useDiff=false, useRdiff=false, fromSnapshot=null,
toSnapshot=null, skipCRC=false, blocking=true, numListstatusThreads=0, maxMaps=20,
mapBandwidth=100, sslConfigurationFile='null', copyStrategy='uniformsize',
preserveStatus=[], preserveRawXattrs=false, atomicWorkPath=null, logPath=null,
sourceFileListing=null, sourcePaths=[file:/home/cloudera/Desktop/ficheros1],
targetPath=hdfs:/user/cloudera/prueba, targetPathExists=true, filtersFile='null',
blocksPerChunk=0, copyBufferSize=8192
}
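One thing that might be worth checking, stated as an assumption and not something confirmed by this trace: in the Hadoop 2.x sources, DistCp.run(String[]) appears to parse its options from the argument array it receives, which would replace the DistCpOptions passed to the constructor. If that is what happens here, the flags could be supplied as command-line switches in the array instead. The sketch below only builds such an array; DistCpArgs and build are hypothetical names used for illustration, not part of Hadoop, and the paths are copied from the code above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Hadoop): builds the argument array that
// would be handed to ToolRunner.run(distCp, argv).
class DistCpArgs {

    // Returns {"-update", "-delete", source, target}, mirroring
    // `hadoop distcp -update -delete <source> <target>`.
    static String[] build(String source, String target) {
        List<String> argv = new ArrayList<>();
        argv.add("-update"); // same switch as on the command line
        argv.add("-delete"); // used together with -update, as in the command above
        argv.add(source);
        argv.add(target);
        return argv.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] argv = build("file:////home/cloudera/Desktop/files",
                              "hdfs:////user/cloudera/test");
        // Assumption: with Hadoop on the classpath, this array would take the
        // place of `arguments` in the call
        //   ToolRunner.run(new DistCp(conf, distCpOptions), argv);
        System.out.println(String.join(" ", argv));
    }
}
```

If the assumption holds, the Input options line in the trace should then report syncFolder=true and deleteMissing=true. If it does not, the cause lies elsewhere and this sketch does not apply.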