Question

I am trying to run the HBase ImportTsv Hadoop job to load data from a TSV file into HBase. I am using the following code.

    Configuration config = new Configuration();
    Iterator iter = config.iterator();
    while(iter.hasNext())
    {
        Object obj = iter.next();
        System.out.println(obj);
    }

    Job job = new Job(config);
    job.setJarByClass(ImportTsv.class);
    job.setJobName("ImportTsv");
    job.getConfiguration().set("user", "hadoop");
    job.waitForCompletion(true);

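I get this error:

ERROR security.UserGroupInformation: PriviledgedActionException as:E317376 cause:org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=E317376, access=WRITE, inode="staging":hadoop:supergroup:rwxr-xr-x

I don't know how the username E317376 is being set. It is my user on the Windows machine from which I am trying to run this job against a remote cluster. My Hadoop user account on the Linux machine is "hadoop".

When I run the job on a Linux machine that is part of the Hadoop cluster, under the hadoop user account, everything works fine. But I want to run this job programmatically from a Java web application. What am I doing wrong? Please help.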

Answer 1

You should have a property like the one below in your mapred-site.xml file:

<property>
<name>mapreduce.jobtracker.staging.root.dir</name>
<value>/user</value>
</property>

You may also need to chmod the /user folder on the DFS file system to 777.
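
To see why the chmod can help, here is a minimal sketch of a POSIX-style permission check applied to the values in the error message (owner hadoop, group supergroup, mode rwxr-xr-x). It is a simplified model for illustration only, not Hadoop's actual implementation: the class and method names are made up, it ignores the HDFS superuser, and it assumes the Windows user E317376 belongs to no group on the cluster.

```java
import java.util.Collections;
import java.util.Set;

public class PermissionSketch {

    /** Picks the rwx triple (owner, group, or other) that applies to the caller. */
    static String applicableBits(String mode, String owner, String group,
                                 String user, Set<String> userGroups) {
        if (user.equals(owner)) {
            return mode.substring(0, 3); // owner bits
        }
        if (userGroups.contains(group)) {
            return mode.substring(3, 6); // group bits
        }
        return mode.substring(6, 9);     // "other" bits
    }

    /** True if the applicable triple has the write bit set. */
    static boolean canWrite(String mode, String owner, String group,
                            String user, Set<String> userGroups) {
        return applicableBits(mode, owner, group, user, userGroups).charAt(1) == 'w';
    }

    public static void main(String[] args) {
        // Assumption: E317376 is in no group known to the cluster.
        Set<String> noGroups = Collections.emptySet();

        // The situation from the error: only "other" bits (r-x) apply, so WRITE is denied.
        System.out.println(canWrite("rwxr-xr-x", "hadoop", "supergroup", "E317376", noGroups)); // false

        // After chmod 777 the "other" bits include w.
        System.out.println(canWrite("rwxrwxrwx", "hadoop", "supergroup", "E317376", noGroups)); // true

        // Submitting as the owner works even with the original mode.
        System.out.println(canWrite("rwxr-xr-x", "hadoop", "supergroup", "hadoop", noGroups));  // true
    }
}
```

In this model the job fails because E317376 is neither the owner nor a member of supergroup, so either opening up the directory or submitting as hadoop gets past the check. Keep in mind that 777 lets any user write there.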

Don't forget to stop and start your jobtracker and tasktrackers (sh stop-mapred.sh and sh start-mapred.sh).

Answer 2

I haven't tested either of these solutions, but try adding something like this to your job configuration:

conf.set("hadoop.job.ugi", "hadoop");

The above may be obsolete, so you can also try the following with user set to hadoop (code from http://hadoop.apache.org/common/docs/r1.0.3/Secure_Impersonation.html):

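import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.security.UserGroupInformation;

// Assumes user (e.g. "hadoop"), conf (a JobConf) and someFilePath (a Path) are defined elsewhere.
UserGroupInformation ugi =
        UserGroupInformation.createProxyUser(user, UserGroupInformation.getLoginUser());
ugi.doAs(new PrivilegedExceptionAction<Void>() {
    public Void run() throws Exception {
        // Submit a job as the proxied user
        JobClient jc = new JobClient(conf);
        jc.submitJob(conf);
        // OR access HDFS as the proxied user
        FileSystem fs = FileSystem.get(conf);
        fs.mkdirs(someFilePath);
        return null; // required by the Void return type
    }
});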

Running an HBase ImportTSV job remotely
