From Spark

Time: 2016-09-23 10:34:06

Tags: mysql, exception, apache-spark, apache-spark-sql

I am getting the exception below when running Spark code that fetches data from MySQL. Can someone please help?

The code is below:

    package sparksql;

    import java.util.List;
    import java.util.Properties;

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class sparksql1 {

        // Assumed: the original class declares LOGGER somewhere; a log4j logger is used here as a placeholder
        // private static final org.apache.log4j.Logger LOGGER = org.apache.log4j.Logger.getLogger(sparksql1.class);

        private static final String MYSQL_CONNECTION_URL = "jdbc:mysql://localhost:3306/company";
        private static final String MYSQL_USERNAME = "test";
        private static final String MYSQL_PWD = "test123";

        private static final SparkSession sparkSession =
                SparkSession.builder().master("local[*]").appName("Spark2JdbcDs")
                        .config("spark.sql.warehouse.dir", "file:///tmp/tmp_warehouse")
                        .getOrCreate();

        public static void main(String[] args) {
            // JDBC connection properties
            final Properties connectionProperties = new Properties();
            connectionProperties.put("user", MYSQL_USERNAME);
            connectionProperties.put("password", MYSQL_PWD);

            Dataset<Row> jdbcDF = sparkSession.sql("SELECT * FROM emp");

            List<Row> employeeFullNameRows = jdbcDF.collectAsList();

            for (Row employeeFullNameRow : employeeFullNameRows) {
                LOGGER.info(employeeFullNameRow);
            }
        }
    }

    16/09/23 13:17:55 INFO internal.SharedState: Warehouse path is 'file:///tmp/tmp_warehouse'.
    16/09/23 13:17:55 INFO execution.SparkSqlParser: Parsing command: SELECT * FROM emp
    Exception in thread "main" java.lang.UnsupportedOperationException: Not implemented by the DistributedFileSystem FileSystem implementation
        at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:217)
        at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2624)
        at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2634)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2651)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
        at org.apache.spark.sql.catalyst.catalog.SessionCatalog.makeQualifiedPath(SessionCatalog.scala:115)
        at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createDatabase(SessionCatalog.scala:145)
        at org.apache.spark.sql.catalyst.catalog.SessionCatalog.<init>(SessionCatalog.scala:89)
        at org.apache.spark.sql.internal.SessionState.catalog$lzycompute(SessionState.scala:95)
        at org.apache.spark.sql.internal.SessionState.catalog(SessionState.scala:95)
        at org.apache.spark.sql.internal.SessionState$$anon$1.<init>(SessionState.scala:112)
        at org.apache.spark.sql.internal.SessionState.analyzer$lzycompute(SessionState.scala:112)
        at org.apache.spark.sql.internal.SessionState.analyzer(SessionState.scala:111)
        at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:49)
        at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
        at org.apache.spark.sql.SparkSession.baseRelationToDataFrame(SparkSession.scala:382)
        at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:238)
        at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:194)
        at sparksql.sparksql1.main(sparksql1.java:40)
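Note that the stack trace goes through org.apache.spark.sql.DataFrameReader.jdbc at sparksql1.java:40 rather than through SparkSession.sql, so the line that actually fails is presumably a JDBC read. A minimal sketch of what such a call looks like in Spark 2.0, reusing the constants from the snippet above; the table name "emp" is an assumption:

    // Hypothetical reconstruction of the call implied by the stack trace, not the exact original code
    Dataset<Row> jdbcDF = sparkSession.read()
            .jdbc(MYSQL_CONNECTION_URL, "emp", connectionProperties);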

Below is the pom file:

    <!-- Hadoop Mapreduce Client Core -->
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-core</artifactId>
        <version>2.7.1</version>
    </dependency>

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.7.1</version>
    </dependency>

    <!-- Hadoop Core -->
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>1.2.1</version>
    </dependency>

    <!-- Spark  -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.10</artifactId>
        <version>2.0.0</version>
    </dependency>

    <!-- Spark SQL  -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.10</artifactId>
        <version>2.0.0</version>
    </dependency>

    <!-- https://mvnrepository.com/artifact/mysql/mysql-connector-java -->
    <dependency>
        <groupId>mysql</groupId>
        <artifactId>mysql-connector-java</artifactId>
        <version>5.1.20</version>
    </dependency>

1 Answer:

Answer 0 (score: 0)

You have added both hadoop-core and hadoop-common to your pom.xml. Remove hadoop-core and try again: hadoop-core 1.2.1 ships an old org.apache.hadoop.fs.FileSystem class that does not implement getScheme(), and when it shadows the 2.7.1 class from hadoop-common you get exactly this "Not implemented by the DistributedFileSystem FileSystem implementation" error.
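
A minimal sketch of the cleaned-up Hadoop section of the pom, assuming you keep the 2.7.1 artifacts from the question and simply drop the conflicting hadoop-core 1.2.1 entry:

    <!-- Hadoop 2.7.x only; the hadoop-core 1.2.1 dependency has been removed -->
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-core</artifactId>
        <version>2.7.1</version>
    </dependency>

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.7.1</version>
    </dependency>

If some other dependency still pulls in hadoop-core transitively, running `mvn dependency:tree` to find it and adding an exclusion on that dependency is the usual next step.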