No FileSystem for scheme: hdfs in a Java program

Date: 2014-06-30 08:35:58

Tags: java mysql hadoop hive sqoop

I am running into a problem when executing this Java code to import a table from MySQL into Hive:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import com.cloudera.sqoop.SqoopOptions;
import com.cloudera.sqoop.SqoopOptions.FileLayout;
import com.cloudera.sqoop.tool.ImportTool;

public class SqoopExample {
    public static void main(String[] args) throws Exception {

        // Load the MySQL JDBC driver.
        String driver = "com.mysql.jdbc.Driver";
        Class.forName(driver);

        // Load the cluster settings from the Hadoop configuration files.
        Configuration config = new Configuration();
        config.addResource(new Path("/home/socio/hadoop/etc/hadoop/core-site.xml"));
        config.addResource(new Path("/home/socio/hadoop/etc/hadoop/hdfs-site.xml"));

        FileSystem dfs = FileSystem.get(config); // fails here with "No FileSystem for scheme: hdfs"

        SqoopOptions options = new SqoopOptions();

        options.setDriverClassName(driver);
        options.setConf(config);
        options.setHiveTableName("tlinesuccess");
        options.setConnManagerClassName("org.apache.sqoop.manager.GenericJdbcManager");
        options.setConnectString("jdbc:mysql://dba-virtual-machine/test");
        options.setHadoopMapRedHome("/home/socio/hadoop");
        options.setHiveHome("/home/socio/hive");
        options.setTableName("textlines");
        options.setColumns(new String[] {"line"});
        options.setUsername("socio");
        options.setNumMappers(1);
        options.setJobName("Test Import");
        options.setOverwriteHiveTable(true);
        options.setHiveImport(true);
        options.setFileLayout(FileLayout.TextFile);

        int ret = new ImportTool().run(options);
        System.exit(ret);
    }
}

Result:

Exception in thread "main" java.io.IOException: No FileSystem for scheme: hdfs
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2385)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2392)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:167)
    at SqoopExample.main(SqoopExample.java:22)

To be clear, this command works:

    sqoop import --connect jdbc:mysql://dba-virtual-machine/test \
        --username socio --table textlines \
        --columns line --hive-import

I can import from MySQL using the shell; the problem is only with the Java code.

Any help or ideas would be greatly appreciated.

Thanks

3 Answers:

Answer 0 (score: 1)

The HDFS filesystem implementation is defined in the library hadoop-hdfs-2.0.0-cdhX.X.X.jar. If you execute this as a plain Java program, you need to add that library to the classpath.

Alternatively, this library is already available on the Hadoop classpath: build a jar file and execute it with the hadoop jar command.
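A related workaround often suggested for this exact exception (a sketch, not part of the original answer; it still requires hadoop-hdfs on the classpath to supply the `DistributedFileSystem` class) is to map the `hdfs` scheme to its implementation class explicitly via the standard `fs.<scheme>.impl` configuration keys, so that `FileSystem.get()` does not depend on the `META-INF/services` metadata that can be lost when jars are merged:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsSchemeFix {
    public static void main(String[] args) throws Exception {
        Configuration config = new Configuration();
        config.addResource(new Path("/home/socio/hadoop/etc/hadoop/core-site.xml"));
        config.addResource(new Path("/home/socio/hadoop/etc/hadoop/hdfs-site.xml"));

        // Register the filesystem implementations for each scheme explicitly,
        // instead of relying on META-INF/services/org.apache.hadoop.fs.FileSystem.
        config.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
        config.set("fs.file.impl", "org.apache.hadoop.fs.LocalFileSystem");

        FileSystem dfs = FileSystem.get(config);
        System.out.println(dfs.getUri());
    }
}
```

The same two `config.set(...)` lines can be dropped into the question's code right after the `addResource` calls.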

Answer 1 (score: 1)

If you are using Maven, this is also a good solution:

https://stackoverflow.com/a/28135140/3451801

Basically, you need to add hadoop-hdfs to your pom dependencies.
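A minimal sketch of that dependency (the version shown is a placeholder; match it to the jar named in Answer 0, i.e. the Hadoop version your cluster actually runs):

```xml
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <!-- placeholder version; use your cluster's Hadoop/CDH version -->
    <version>2.0.0-cdhX.X.X</version>
</dependency>
```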

Answer 2 (score: 1)

When building your Maven jar, add this plugin. It merges everything into a single jar, and the ServicesResourceTransformer below merges the META-INF/services/org.apache.hadoop.fs.FileSystem entries from all jars, so the hdfs scheme stays registered in the shaded jar. Also add the hadoop-hdfs and hadoop-client dependencies:

        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>1.5</version>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>

                    <configuration>
                        <filters>
                            <filter>
                                <artifact>*:*</artifact>
                                <excludes>
                                    <exclude>META-INF/*.SF</exclude>
                                    <exclude>META-INF/*.DSA</exclude>
                                    <exclude>META-INF/*.RSA</exclude>
                                </excludes>
                            </filter>
                        </filters>
                        <shadedArtifactAttached>true</shadedArtifactAttached>
                        <shadedClassifierName>allinone</shadedClassifierName>
                        <artifactSet>
                            <includes>
                                <include>*:*</include>
                            </includes>
                        </artifactSet>
                        <transformers>
                            <transformer
                                implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                                <resource>reference.conf</resource>
                            </transformer>
                            <transformer
                                implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                            </transformer>
                            <transformer
                                implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer">
                            </transformer>
                        </transformers>
                    </configuration>
                </execution>
            </executions>
        </plugin>