org.apache.hadoop.conf.Configuration loadResource error

Date: 2014-11-17 13:44:48

Tags: java eclipse apache hadoop

I am creating a simple hello-world Hadoop project, and I really don't know what I need to include to resolve this error. It seems the Hadoop libraries require some resource that I am not including.

I tried adding the following VM argument to my run configuration, but it did not help solve the problem:

-Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl
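Before forcing an implementation with that flag, it can help to see which `DocumentBuilderFactory` the JVM actually resolves on your classpath. The error suggests an old Xerces jar is being picked up instead of the JDK's built-in parser, and the old jar does not recognize the XInclude feature. A minimal diagnostic sketch (the class name `CheckJaxp` is just an illustrative choice):

```java
import javax.xml.parsers.DocumentBuilderFactory;

public class CheckJaxp {
    public static void main(String[] args) {
        // Print the concrete factory class that JAXP resolves at runtime.
        // If this names an old Xerces jar on the classpath rather than the
        // JDK-internal implementation, that jar is the likely culprit.
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        System.out.println("DocumentBuilderFactory impl: " + dbf.getClass().getName());

        // The standard JAXP way to enable XInclude; the JDK's built-in
        // factory supports this, while very old Xerces releases do not.
        dbf.setXIncludeAware(true);
        System.out.println("XInclude aware: " + dbf.isXIncludeAware());
    }
}
```

Run this with the same classpath as the failing program; if the printed implementation is not the one you expect, the `-D` flag or a classpath fix (such as the shading approach in the answer below) is needed.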

Here is my code:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Writes a static string to a file using the Hadoop libraries.
 */
public class WriteToFile {

    public static void main(String[] args) {

        //String to print to file
        final String HELLOWORLD = "Hello World! This is Chris writing to the file.";

        try {
            //Instantiating the configuration
            Configuration conf = new Configuration();

            //Creating the file system
            FileSystem fs = FileSystem.get(conf);

            //Instantiating the path 
            Path path = new Path("/user/c4511/homework1.txt");

            //Checking for the existence of the file
            if(fs.exists(path)){
                //delete if it already exists
                fs.delete(path, true);
            }

            //Creating an output stream
            FSDataOutputStream fsdos = fs.create(path);

            //Writing helloworld static string to the file
            fsdos.writeUTF(HELLOWORLD);

            //Closing all connection
            fsdos.close();
            fs.close();
        } 
        catch (IOException e) {
            e.printStackTrace();
        }
    }
}

What is causing this problem?

Here is the error I am getting:

Nov 17, 2014 9:30:30 AM org.apache.hadoop.conf.Configuration loadResource
SEVERE: error parsing conf file: javax.xml.parsers.ParserConfigurationException: Feature 'http://apache.org/xml/features/xinclude' is not recognized.
Exception in thread "main" java.lang.RuntimeException: javax.xml.parsers.ParserConfigurationException: Feature 'http://apache.org/xml/features/xinclude' is not recognized.
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1833)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1689)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1635)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:790)
    at org.apache.hadoop.fs.FileSystem.getDefaultUri(FileSystem.java:166)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:158)
    at WriteToFile.main(WriteToFile.java:24)
Caused by: javax.xml.parsers.ParserConfigurationException: Feature 'http://apache.org/xml/features/xinclude' is not recognized.
    at org.apache.xerces.jaxp.DocumentBuilderFactoryImpl.newDocumentBuilder(Unknown Source)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1720)
    ... 6 more

1 answer:

Answer 0: (score: 1)

I got the same exception in my project when I moved it from Hadoop 2.5.1 to 2.6.0. I solved it with the Maven POM below, which adds xerces:* to the shaded jar file.

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>emc.lab.hadoop</groupId>
<artifactId>DartAnalytics</artifactId>
<version>0.0.1-SNAPSHOT</version>
<name>DartAnalytics</name>
<description>Examples for usage of Dart simulated data</description>
<properties>
    <main.class>OffsetRTMain</main.class>
    <hadoop.version>2.6.0</hadoop.version>
    <minimize.jar>true</minimize.jar>
</properties>
<!-- <repositories> <repository> <id>mvn.twitter</id> <url>http://maven.twttr.com</url> 
    </repository> </repositories> -->
<build>
    <plugins>
        <plugin>
<!-- The shade plugin lets us bundle the dependencies into the
                jar file -->
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>2.3</version>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>
                </execution>
            </executions>
            <configuration>
<!-- minimizeJar strips classes that are not referenced anywhere,
                    but the filters below force-include artifacts we must keep -->
                <minimizeJar>${minimize.jar}</minimizeJar>
                <filters>
                    <filter>
                        <artifact>com.hadoop.gplcompression:hadoop-lzo</artifact>
                        <includes>
                            <include>**</include>
                        </includes>
                    </filter>
                    <filter>
                    <!-- This solves the hadoop 2.6.0 problem with ClassNotFound of "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl" -->
                        <artifact>xerces:*</artifact>
                        <includes>
                            <include>**</include>
                        </includes>
                    </filter>

                    <filter>
                        <artifact>org.apache.hadoop:*</artifact>
                        <excludes>
                            <exclude>**</exclude>
                        </excludes>
                    </filter>
                </filters>
                <finalName>uber-${project.artifactId}-${project.version}</finalName>
                <transformers>
                    <transformer
                        implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                        <mainClass>${main.class}</mainClass>
                    </transformer>
                </transformers>
            </configuration>
        </plugin>

    </plugins>
</build>

<dependencies>
    <!-- you can add this to the local repo by running mvn install:install-file 
        -Dfile=libs/hadoop-lzo-0.4.20-SNAPSHOT.jar -DgroupId=com.hadoop.gplcompression 
        -DartifactId=hadoop-lzo -Dversion=0.4.20 -Dpackaging=jar from the main project 
        directory -->
    <!-- Another option is to build from outside the EMC network and get access 
        to the twitter maven repository by changing the version to a version in the 
        repository and un-commenting the repository addition -->
    <dependency>
        <groupId>com.hadoop.gplcompression</groupId>
        <artifactId>hadoop-lzo</artifactId>
        <version>0.4.20</version>
    </dependency>

    <dependency>
        <groupId>net.sf.trove4j</groupId>
        <artifactId>trove4j</artifactId>
        <version>3.0.3</version>
    </dependency>

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-core</artifactId>
        <version>${hadoop.version}</version>
    </dependency>

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>${hadoop.version}</version>
    </dependency>

    <dependency>
        <groupId>com.google.protobuf</groupId>
        <artifactId>protobuf-java</artifactId>
        <version>2.5.0</version>
    </dependency>

    <dependency>
        <groupId>com.twitter.elephantbird</groupId>
        <artifactId>elephant-bird-core</artifactId>
        <version>4.5</version>
    </dependency>
    <!-- <dependency> <groupId>com.google.guava</groupId> <artifactId>guava</artifactId> 
        <version>18.0</version> </dependency> -->


</dependencies>
</project>