Executing Hadoop code without HDFS

Date: 2014-05-20 07:54:06

Tags: java hadoop hdfs

I run into a problem when executing code that accesses HDFS without going through:

hadoop jar

Here is the code I am trying to run:

package com.infotel.mycompany.testhdfs;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Reads a file from HDFS and prints it line by line.
 */
public class App
{
    public static void main( String[] args ) throws IOException
    {
        Configuration config = new Configuration();
        // addResource(String) looks the name up on the classpath, so a
        // filesystem location has to be wrapped in a Path to be loaded.
        config.addResource(new Path("/opt/hadoop-2.2.0/etc/hadoop/core-site.xml"));
        config.set("fs.defaultFS", "hdfs://192.168.2.164/");

        FileSystem dfs = FileSystem.get(config);

        // Open the file on HDFS and print it line by line.
        Path pt = new Path("/path/to/myfile");
        BufferedReader br = new BufferedReader(new InputStreamReader(dfs.open(pt)));
        String line = br.readLine();
        while (line != null) {
            System.out.println(line);
            line = br.readLine();
        }
        br.close();
    }
}

I build the code with Maven, using the following pom.xml:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.mycompany.bigdata</groupId>
    <artifactId>testhdfs</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>

    <name>testhdfs</name>
    <url>http://maven.apache.org</url>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>3.8.1</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-core</artifactId>
            <version>0.20.2</version>
        </dependency>

    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>exec-maven-plugin</artifactId>
                <configuration>
                    <mainClass>com.infotel.bigdata.testhdfs.App</mainClass>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>

When I run my code with this command:

hadoop jar target/testhdfs-0.0.1-SNAPSHOT.jar com.infotel.mycompany.testhdfs.App

it works fine. However, if I run my code with this command, or from Eclipse:

mvn exec:java

I get the following error (note that the stack trace goes through RawLocalFileSystem, i.e. the path is being resolved against the local filesystem instead of HDFS):

java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:293)
        at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.FileNotFoundException: File /path/to/myfile does not exist.
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
        at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:125)
        at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:283)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:356)
        at com.infotel.bigdata.testhdfs.App.main(App.java:30)

Here is my core-site.xml:

<configuration>    
    <property>
         <name>fs.defaultFS</name>   
         <value>hdfs://chef</value>
         <description>NameNode URI </description>
    </property> 
    <property>       
        <name>hadoop.http.staticuser.user</name>  
        <value>hdfs</value>  
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/hadoop-2.2.0/tmp</value>
    </property>
</configuration>

I hit the same problem when I try to run a MapReduce job: I have to use the hadoop jar command instead of running it from Eclipse. I am using Hadoop 2.2.0. It looks like I am completely missing or misunderstanding something, and searching on Google has not helped.

If anyone has a solution I would be very grateful. The final goal is to retrieve a file from HDFS inside a servlet, which is why I cannot use hadoop jar; a rough sketch of what I am aiming for follows.
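
For reference, here is a minimal sketch of that servlet, reusing the NameNode address from the code above; the class name HdfsFileServlet is made up and /path/to/myfile is still just a placeholder:

import java.io.IOException;
import java.io.InputStream;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

// Hypothetical servlet that streams a file from HDFS into the HTTP response.
public class HdfsFileServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        Configuration config = new Configuration();
        config.set("fs.defaultFS", "hdfs://192.168.2.164/");

        FileSystem dfs = FileSystem.get(config);
        InputStream in = dfs.open(new Path("/path/to/myfile"));
        try {
            resp.setContentType("text/plain");
            // Copy the HDFS stream to the response with a 4 KB buffer;
            // the last argument tells IOUtils not to close the streams.
            IOUtils.copyBytes(in, resp.getOutputStream(), 4096, false);
        } finally {
            in.close();
        }
    }
}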

2 Answers:

Answer 0 (score: 0)

In this statement, you should add the port on which the NameNode is started:

config.set("fs.defaultFS", "hdfs://192.168.2.164/");

The default NameNode port in Apache Hadoop is 8020:

config.set("fs.defaultFS", "hdfs://192.168.2.164:8020");
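
As a quick sanity check (a minimal sketch; the class name CheckFs is made up), you can print which FileSystem implementation actually gets resolved. Against a reachable HDFS you should see org.apache.hadoop.hdfs.DistributedFileSystem; the RawLocalFileSystem in your stack trace means the request never left the local filesystem:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class CheckFs {
    public static void main(String[] args) throws Exception {
        Configuration config = new Configuration();
        config.set("fs.defaultFS", "hdfs://192.168.2.164:8020");

        FileSystem fs = FileSystem.get(config);
        // Prints the URI and the concrete FileSystem class that was
        // resolved; a LocalFileSystem here means fs.defaultFS was ignored.
        System.out.println(fs.getUri());
        System.out.println(fs.getClass().getName());
    }
}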

Answer 1 (score: 0)

I found the solution. First, I tried running my code with every library from my Hadoop installation on the classpath, using this command:

java -cp `hadoop classpath`:target/testhdfs-0.0.1-SNAPSHOT.jar com.infotel.bigdata.testhdfs.App

That worked, so I found out it was a jar version problem: with hadoop-core 0.20.2 on the classpath, the fs.defaultFS setting is presumably ignored (the 0.20.x line still used fs.default.name), so the code fell back to the local filesystem. After some blind searching, I changed my pom.xml to:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.mycompany.bigdata</groupId>
    <artifactId>testhdfs</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>

    <name>testhdfs</name>
    <url>http://maven.apache.org</url>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>3.8.1</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.2.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>2.2.0</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>exec-maven-plugin</artifactId>
                <configuration>
                    <mainClass>com.infotel.bigdata.testhdfs.App</mainClass>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>

Now mvn exec:java works fine. Maybe hadoop-core should only be used for old Hadoop versions? Does anyone know more about the difference between hadoop-core and hadoop-common?
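
For what it's worth, Hadoop 2.x also publishes an aggregating hadoop-client artifact that pulls in matching versions of the common, HDFS and MapReduce client jars, which might be a simpler alternative to listing them individually:

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.2.0</version>
</dependency>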