Java trying to connect to HDFS throws "HADOOP_HOME unset", cannot find winutils

Posted: 2017-06-23 15:15:03

Tags: java hadoop hdfs

I am trying to prototype an application that uses Hadoop as a data store, and I have fallen at the first hurdle. I have access to a Hadoop cluster, and I grabbed a test sample from Spring to try the first baby step:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.junit.jupiter.api.Test;
import java.io.PrintWriter;
import java.net.URI;
import java.util.Scanner;

public class HdfsTest {

    @Test
    public void testHdfs() throws Exception {    

        System.setProperty("HADOOP_USER_NAME", "adam");

        // Path that we need to create in HDFS.
        // Just like Unix/Linux file systems, HDFS file system starts with "/"
        final Path path = new Path("/usr/adam/junk.txt");

        // Uses try with resources in order to avoid close calls on resources
        // Creates anonymous sub class of DistributedFileSystem to allow calling
        // initialize as DFS will not be usable otherwise
        try (
                final DistributedFileSystem dFS
                        = new DistributedFileSystem() {
                    {
                        initialize(new URI(
                                "hdfs://hanameservice/user/adam"),
                                new Configuration());
                    }
                };
                // Gets output stream for input path using DFS instance
                final FSDataOutputStream streamWriter = dFS.create(path);
                // Wraps output stream into PrintWriter to use high level
                // and sophisticated methods
                final PrintWriter writer = new PrintWriter(streamWriter);
                ) {
            // Writes a couple of test lines to the file using the print writer
            writer.println("bungalow bill");
            writer.println("what did you kill");
            System.out.println("File Written to HDFS successfully!");
        }
    }
}

These are the Hadoop libraries I am using:

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>2.8.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.8.1</version>
    </dependency>

Could I be missing a dependency?

Here is the log output with the errors, although there appear to be two separate errors.

2017-06-23 16:01:38.787  WARN   --- [           main] org.apache.hadoop.util.Shell             : Did not find winutils.exe: {}

java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. -see https://wiki.apache.org/hadoop/WindowsProblems
    at org.apache.hadoop.util.Shell.fileNotFoundException(Shell.java:528)
    at org.apache.hadoop.util.Shell.getHadoopHomeDir(Shell.java:549)
    at org.apache.hadoop.util.Shell.getQualifiedBin(Shell.java:572)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:669)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
    at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1445)
    at org.apache.hadoop.fs.FileSystem.initialize(FileSystem.java:221)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:145)
    at com.bp.gis.tardis.HdfsTest$1.<init>(HdfsTest.java:34)
    at com.bp.gis.tardis.HdfsTest.testHdfs(HdfsTest.java:31)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:316)
    at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:114)
    at org.junit.jupiter.engine.descriptor.MethodTestDescriptor.lambda$invokeTestMethod$6(MethodTestDescriptor.java:171)
    at org.junit.jupiter.engine.execution.ThrowableCollector.execute(ThrowableCollector.java:40)
    at org.junit.jupiter.engine.descriptor.MethodTestDescriptor.invokeTestMethod(MethodTestDescriptor.java:168)
    at org.junit.jupiter.engine.descriptor.MethodTestDescriptor.execute(MethodTestDescriptor.java:115)
    at org.junit.jupiter.engine.descriptor.MethodTestDescriptor.execute(MethodTestDescriptor.java:57)
    at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.lambda$execute$1(HierarchicalTestExecutor.java:81)
    at org.junit.platform.engine.support.hierarchical.SingleTestExecutor.executeSafely(SingleTestExecutor.java:66)
    at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:76)
    at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.lambda$execute$1(HierarchicalTestExecutor.java:91)
    at org.junit.platform.engine.support.hierarchical.SingleTestExecutor.executeSafely(SingleTestExecutor.java:66)
    at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:76)
    at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.lambda$execute$1(HierarchicalTestExecutor.java:91)
    at org.junit.platform.engine.support.hierarchical.SingleTestExecutor.executeSafely(SingleTestExecutor.java:66)
    at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:76)
    at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:51)
    at org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:43)
    at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:137)
    at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:87)
    at org.junit.platform.launcher.Launcher.execute(Launcher.java:93)
    at com.intellij.junit5.JUnit5IdeaTestRunner.startRunnerWithArgs(JUnit5IdeaTestRunner.java:61)
    at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
    at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
    at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
Caused by: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset.
    at org.apache.hadoop.util.Shell.checkHadoopHomeInner(Shell.java:448)
    at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:419)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:496)
    ... 35 common frames omitted

2017-06-23 16:01:39.449  WARN   --- [           main] org.apache.hadoop.util.NativeCodeLoader  : Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

java.lang.IllegalArgumentException: java.net.UnknownHostException: hanameservice

    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
    at org.apache.hadoop.hdfs.NameNodeProxiesClient.createProxyWithClientProtocol(NameNodeProxiesClient.java:130)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:343)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:287)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:156)
    at com.bp.gis.tardis.HdfsTest$1.<init>(HdfsTest.java:34)
    at com.bp.gis.tardis.HdfsTest.testHdfs(HdfsTest.java:31)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:316)
    at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:114)
    at org.junit.jupiter.engine.descriptor.MethodTestDescriptor.lambda$invokeTestMethod$6(MethodTestDescriptor.java:171)
    at org.junit.jupiter.engine.execution.ThrowableCollector.execute(ThrowableCollector.java:40)
    at org.junit.jupiter.engine.descriptor.MethodTestDescriptor.invokeTestMethod(MethodTestDescriptor.java:168)
    at org.junit.jupiter.engine.descriptor.MethodTestDescriptor.execute(MethodTestDescriptor.java:115)
    at org.junit.jupiter.engine.descriptor.MethodTestDescriptor.execute(MethodTestDescriptor.java:57)
    at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.lambda$execute$1(HierarchicalTestExecutor.java:81)
    at org.junit.platform.engine.support.hierarchical.SingleTestExecutor.executeSafely(SingleTestExecutor.java:66)
    at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:76)
    at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.lambda$execute$1(HierarchicalTestExecutor.java:91)
    at org.junit.platform.engine.support.hierarchical.SingleTestExecutor.executeSafely(SingleTestExecutor.java:66)
    at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:76)
    at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.lambda$execute$1(HierarchicalTestExecutor.java:91)
    at org.junit.platform.engine.support.hierarchical.SingleTestExecutor.executeSafely(SingleTestExecutor.java:66)
    at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:76)
    at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:51)
    at org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:43)
    at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:137)
    at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:87)
    at org.junit.platform.launcher.Launcher.execute(Launcher.java:93)
    at com.intellij.junit5.JUnit5IdeaTestRunner.startRunnerWithArgs(JUnit5IdeaTestRunner.java:61)
    at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
    at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
    at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
Caused by: java.net.UnknownHostException: hanameservice
    ... 36 more

How do I resolve this? My contacts for the Hadoop cluster I am trying to connect to are not familiar with the hdfs: protocol; their frame of reference seems to be entirely manual rather than programmatic. They want me to log in to the edge node and run scripts in a shell. I feel I should be asking them for something specific, but I am not sure what.

1 Answer:

Answer 0 (score: 1)

There are two distinct problems:

  1. It appears you are running from a Windows host. On Windows, Hadoop requires its native code extensions so that it can integrate correctly with the operating system for things such as file access semantics and permissions. Note that the exception message contains a link to the Apache Hadoop wiki page WindowsProblems, which has information on how to deal with this (see the sketch after this list).
  2. A socket connection could not be established to the host "hanameservice". This is most likely not a real hostname but rather a logical name used for HDFS High Availability. Internally, the HDFS client code maps this logical name onto one of the two real NameNode hostnames, but only when its configuration is complete. You probably do not have the full set of configuration files from the cluster (core-site.xml and hdfs-site.xml); you need the complete configuration on your local system for this to work (also covered in the sketch after this list).
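
A minimal sketch of what fixes for both problems might look like inside the test, assuming winutils.exe has already been unpacked under a local directory such as C:\hadoop\bin and that copies of the cluster's core-site.xml and hdfs-site.xml have been obtained. The Windows paths below are placeholders, not values from your cluster, and org.apache.hadoop.fs.FileSystem would need to be imported:

    // Point Hadoop's Shell utilities at a local winutils installation before any
    // Hadoop class is loaded; this is the programmatic equivalent of HADOOP_HOME.
    System.setProperty("hadoop.home.dir", "C:\\hadoop");

    // Load the cluster's site files so that the logical HA name "hanameservice"
    // can be resolved to the real NameNode hosts.
    Configuration conf = new Configuration();
    conf.addResource(new Path("C:\\hadoop-conf\\core-site.xml"));
    conf.addResource(new Path("C:\\hadoop-conf\\hdfs-site.xml"));

    // With fs.defaultFS supplied by core-site.xml, the anonymous
    // DistributedFileSystem subclass is unnecessary; the factory method suffices.
    FileSystem fs = FileSystem.get(conf);

Configuration.addResource(Path) and FileSystem.get(Configuration) are standard hadoop-common APIs; only the local paths are invented for illustration.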

    They want me to log in to the edge node and run scripts in a shell.

    Overall, this is probably your path of least resistance, rather than trying to work through the Windows integration and configuration. If you wrap your code in the Hadoop Tool interface, build it as a jar, and copy that jar to the edge node, you can run it there as hadoop jar your-app.jar. You will be running in a known working environment, with no need to troubleshoot the native code extensions or to worry about whether your configuration is complete and in sync with the cluster's configuration. A rough sketch of that approach follows.
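
For illustration only, here is one way the same write might be wrapped in the standard Tool/ToolRunner pattern. The class name HdfsJunkWriter and the jar name are made up; Tool, ToolRunner, Configured, and FileSystem are standard Hadoop APIs:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;
    import java.io.PrintWriter;

    public class HdfsJunkWriter extends Configured implements Tool {

        @Override
        public int run(String[] args) throws Exception {
            // getConf() is populated by ToolRunner from the cluster's site files
            // already present on the edge node, plus any -D/-conf options.
            FileSystem fs = FileSystem.get(getConf());
            // Same target path as in the test above.
            Path path = new Path("/usr/adam/junk.txt");
            try (FSDataOutputStream out = fs.create(path);
                 PrintWriter writer = new PrintWriter(out)) {
                writer.println("bungalow bill");
                writer.println("what did you kill");
            }
            return 0;
        }

        public static void main(String[] args) throws Exception {
            System.exit(ToolRunner.run(new Configuration(), new HdfsJunkWriter(), args));
        }
    }

Built into a jar and copied to the edge node, this would run as hadoop jar your-app.jar HdfsJunkWriter (or without the class name if the jar's manifest declares a Main-Class), picking up the cluster configuration that is already in place there.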