Spark error: GenericRowWithSchema cannot be cast to scala.collection.mutable.WrappedArray

Asked: 2017-10-18 16:33:38

Tags: scala apache-spark

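I am using Spark 1.6 and trying to read and transform the values of a DataFrame row.

Here is my problem: a row in my DataFrame holds a value with this structure: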

WrappedArray([List of String], [List of String]) 

I need to work with the [List of String] elements inside the WrappedArray, so I tried to cast the value with this code:

 val RDD = DF.map(
    f => {
      if (f.getAs("ListOfRficAction") != null) {
        // This cast is the line that raises the ClassCastException shown below
        var listActions = f.getAs("ColumnName").asInstanceOf[WrappedArray[List[List[Any]]]].map(m => m :+ f.getAs("AssetId").toString)
      }
    })

I get the following error:

java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema cannot be cast to scala.collection.mutable.WrappedArray

Any idea how to cast it?

3 answers:

Answer 0 (score: 1)

Answer 1 (score: 0)

Thanks for the answer. I am using a Maven project, not sbt. My code compiles without problems, and the Spark error points to this line: var listActions = f.getAs("ColumnName").asInstanceOf[WrappedArray[List[List[Any]]]] .map(m=>m:+f.getAs("AssetId").toString). Here is my whole code:

val ficRDDResult = ficDataFrameSelect.map(
  f => {
    if (f.getAs("ListOfRficAction") != null) {
      var listActions = f.getAs("ListOfRficAction").asInstanceOf[WrappedArray[List[List[Any]]]].map(m => m :+ f.getAs("AssetId").toString)

      var listAttachments = listActions
        .map(m => {
          m.map(x => {
            val a = Try(x.asInstanceOf[List[Any]])
            if (a.isSuccess)
              x
            else
              null
          }).filter(f => f != null).map(x => x.asInstanceOf[List[Any]])
        })
        .flatMap(f => f)
        .filter(f => f != null)

      (listActions, listAttachments)
    } else {
      (null, null)
    }
  }).filter(f => f._1 != null)
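
The exception message says that the value being cast is a Row (GenericRowWithSchema), which is how Spark hands back a struct value, so the nested elements are probably structs rather than Scala List values. Calling printSchema() on the DataFrame should show the actual column type. The following is only a minimal sketch under that assumption, not a confirmed fix: it supposes the actions can be read as a Seq[Row], and the names RowCastSketch and appendAssetId are hypothetical, introduced here for illustration.

```scala
import org.apache.spark.sql.Row

object RowCastSketch {
  // Hypothetical helper (not from the original post): each struct element
  // arrives as a Row, so it is converted to a List[Any] with toSeq.toList
  // before the asset id is appended.
  def appendAssetId(actions: Seq[Row], assetId: String): Seq[List[Any]] =
    actions.map(action => action.toSeq.toList :+ assetId)

  def main(args: Array[String]): Unit = {
    // Stand-in rows built by hand. In the real job they might come from
    // f.getAs[Seq[Row]]("ListOfRficAction"), ASSUMING the column is an
    // array of structs; if the column itself is a struct, it would have to
    // be read as a Row first and the array taken from one of its fields.
    val actions = Seq(Row("a", "b"), Row("c"))
    appendAssetId(actions, "42").foreach(println)
  }
}
```

Reading the value with an explicit type parameter, for example getAs[Seq[Row]], avoids casting to WrappedArray[List[...]] altogether, which seems to be where the original code fails.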

Answer 2 (score: -3)

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.renault.qualite</groupId>
    <version>2.0.4</version>
    <name>qualite_spark</name>

    <properties>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
        <maven.compiler.plugin>3.5.1</maven.compiler.plugin>
        <maven.surefire.plugin>2.19.1</maven.surefire.plugin>
        <maven.assembly.plugin>3.0.0</maven.assembly.plugin>
        <scala.maven.plugin>3.2.1</scala.maven.plugin>

        <elasticsearch.version>2.4.2</elasticsearch.version>
        <elasticsearch.spark.version>2.4.2</elasticsearch.spark.version>
        <spark.version>1.6.2.2.5.3.0-37</spark.version>
        <hbase.version>1.1.2.2.5.3.0-37</hbase.version>
        <shc.version>1.1.2-1.6-s_2.10</shc.version>
        <biojava3.version>3.0</biojava3.version>
        <jsqlparser.version>0.9.5</jsqlparser.version>

        <encoding>UTF-8</encoding>
    </properties>

    <dependencies>
        <!-- ElasticSearch shaded -->
        <dependency>
            <groupId>com.renault.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
            <version>${elasticsearch.version}</version>
            <exclusions>
                <exclusion>
                    <artifactId>netty</artifactId>
                    <groupId>io.netty</groupId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.scala-lang.modules</groupId>
            <artifactId>scala-xml_2.11</artifactId>
            <version>1.0.2</version>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch-yarn</artifactId>
            <version>${elasticsearch.version}</version>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch-spark_2.10</artifactId>
            <version>${elasticsearch.spark.version}</version>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>org.apache.spark</groupId>
                    <artifactId>spark-sql_2.10</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <!-- add the License jar as a dependency -->
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch-license-plugin</artifactId>
            <version>1.0.0</version>
            <scope>runtime</scope>
        </dependency>
        <!-- Spark -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>${spark.version}</version>
            <scope>compile</scope>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-catalyst_2.10</artifactId>
            <version>${spark.version}</version>
            <scope>compile</scope>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
            </exclusions>
        </dependency>       
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.10</artifactId>
            <version>${spark.version}</version>
            <scope>compile</scope>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_2.10</artifactId>
            <version>${spark.version}</version>
            <scope>provided</scope>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

        <!-- Hbase -->
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-server</artifactId>
            <version>${hbase.version}</version>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
                <exclusion>
                    <artifactId>netty</artifactId>
                    <groupId>io.netty</groupId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-client</artifactId>
            <version>${hbase.version}</version>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-common</artifactId>
            <version>${hbase.version}</version>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

        <dependency>
            <groupId>eu.unicredit</groupId>
            <artifactId>hbase-rdd_2.10</artifactId>
            <version>0.8.0</version>
        </dependency>

        <dependency>
            <groupId>org.json4s</groupId>
            <artifactId>json4s-jackson_2.10</artifactId>
            <version>3.2.10</version>
        </dependency>

        <!-- zhzhan -->
        <dependency>
            <groupId>com.hortonworks</groupId>
            <artifactId>shc-core</artifactId>
            <version>${shc.version}</version>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
                <exclusion>
                    <artifactId>netty</artifactId>
                    <groupId>io.netty</groupId>
                </exclusion>
            </exclusions>
        </dependency>

        <!-- Biojava3 -->
        <dependency>
            <groupId>org.biojava</groupId>
            <artifactId>biojava3-core</artifactId>
            <version>${biojava3.version}</version>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

        <!-- java sql parser -->
        <dependency>
            <groupId>com.github.jsqlparser</groupId>
            <artifactId>jsqlparser</artifactId>
            <version>${jsqlparser.version}</version>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <!-- dependency removed because it is not available on Maven Central <dependency> 
            <groupId>javax.jms</groupId> <artifactId>jms</artifactId> <version>1.1</version> 
            </dependency> -->
        <dependency>
            <groupId>javax.jms</groupId>
            <artifactId>javax.jms-api</artifactId>
            <version>2.0.1</version>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

        <!-- sicg -->
        <dependency>
            <groupId>sicg</groupId>
            <artifactId>sicg</artifactId>
            <version>3.4.1</version>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

        <!-- Commons-Mail -->
        <dependency>
            <groupId>org.apache.commons</groupId>
            <artifactId>commons-email</artifactId>
            <version>1.3.2</version>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

        <!-- Spark-CSV -->
        <dependency>
            <groupId>com.databricks</groupId>
            <artifactId>spark-csv_2.10</artifactId>
            <version>1.4.0</version>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

        <!-- Spark-AVRO -->
        <dependency>
            <groupId>com.databricks</groupId>
            <artifactId>spark-avro_2.10</artifactId>
            <version>2.0.1</version>
        </dependency>       

        <!-- HBase write -->
        <dependency>
            <groupId>it.nerdammer.bigdata</groupId>
            <artifactId>spark-hbase-connector_2.10</artifactId>
            <version>1.0.3</version>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

        <!-- Imported into the local repository with the command: mvn install:install-file 
            -Dfile=connector.jar -DgroupId=com.ibm -Dversion=1 -DartifactId=connector 
            -Dpackaging=jar -DlocalRepositoryPath=D:\Dev\Apps\workspace\sparkODI\repo -->

        <!-- Oracle JDBC -->
        <dependency>
            <groupId>oracle.jdbc</groupId>
            <artifactId>ojdbc</artifactId>
            <version>6</version>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

        <!-- MySQL JDBC -->
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>5.1.38</version>
        </dependency>

        <!-- MS SQL JDBC -->
        <dependency>
            <groupId>com.microsoft.sqlserver</groupId>
            <artifactId>sqljdbc4</artifactId>
            <version>4.0</version>
        </dependency>
        <!-- hive-jdbc 
            <dependency>
                <groupId>org.apache.hive</groupId>
                <artifactId>hive-jdbc</artifactId>
                <version>2.1.1</version>
            </dependency-->
        <!-- JMS for IBM MQ -->
        <dependency>
            <groupId>com.ibm</groupId>
            <artifactId>ibm-mq</artifactId>
            <version>1</version>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>com.ibm</groupId>
            <artifactId>ibm-mq-pcf</artifactId>
            <version>1</version>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>com.ibm</groupId>
            <artifactId>ibm-mqbind</artifactId>
            <version>1</version>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>com.ibm</groupId>
            <artifactId>ibm-mqjms</artifactId>
            <version>1</version>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>com.ibm</groupId>
            <artifactId>connector</artifactId>
            <version>1</version>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

<!--
        <dependency>
            <groupId>com.esotericsoftware.kryo</groupId>
            <artifactId>kryo</artifactId>
            <version>2.21</version>
            <exclusions>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
-->
        <dependency>
            <groupId>commons-codec</groupId>
            <artifactId>commons-codec</artifactId>
            <version>1.10</version>
        </dependency>

        <!-- scala -->
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>2.10.6</version>
        </dependency>

        <!-- Spark Xml -->
        <dependency>
            <groupId>com.databricks</groupId>
            <artifactId>spark-xml_2.10</artifactId>
            <version>0.4.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_2.10</artifactId>
            <version>${spark.version}</version>
            <scope>provided</scope>
        </dependency>

        <!-- Apache HttpClient -->
        <dependency>
            <groupId>org.apache.httpcomponents</groupId>
            <artifactId>httpclient</artifactId>
            <version>4.5.2</version>
        </dependency>

    </dependencies>

    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.apache.lucene</groupId>
                <artifactId>lucene-core</artifactId>
                <version>5.5.0</version>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <!-- list of other repositories -->
    <repositories>
        <!-- repository added for the ODI stream project -->
        <repository>
            <id>project.local</id>
            <name>project</name>
            <url>file:${project.basedir}/repo</url>
        </repository>
        <repository>
            <id>hortonworks-public</id>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
            <url>http://repo.hortonworks.com/content/groups/public/</url>
        </repository>

        <repository>
            <id>hortonworks-nexus</id>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
            <url>http://nexus-private.hortonworks.com:8081/nexus/content/repositories/IN-QA/</url>
        </repository>

        <repository>
            <id>hortonworks</id>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
            <url>http://repo.hortonworks.com/content/repositories/releases/</url>
        </repository>

        <repository>
            <id>grails</id>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
            <url>http://repo.grails.org/grails/repo/</url>
        </repository>

        <repository>
            <id>central</id>
            <url>http://central.maven.org/maven2/</url>
            <releases>
                <enabled>true</enabled>
            </releases>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </repository>
        <repository>
            <id>repo1</id>
            <url>http://repo1.maven.org/maven2</url>
            <releases>
                <enabled>true</enabled>
            </releases>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </repository>
        <repository>
            <id>elasticsearch-releases</id>
            <url>https://maven.elasticsearch.org/releases</url>
            <releases>
                <enabled>true</enabled>
            </releases>
            <snapshots>
                <enabled>true</enabled>
            </snapshots>
        </repository>
        <repository>
            <id>SparkPackagesRepo</id>
            <name>SparkPackagesRepo</name>
            <url>http://dl.bintray.com/spark-packages/maven</url>
        </repository>
        <repository>
            <id>jsqlparser-snapshots</id>
            <snapshots>
                <enabled>true</enabled>
            </snapshots>
            <url>https://oss.sonatype.org/content/groups/public/</url>
        </repository>

    </repositories>

    <pluginRepositories>
        <pluginRepository>
            <id>java.net</id>
            <name>java.net</name>
            <url>http://download.java.net/maven/2</url>
        </pluginRepository>
    </pluginRepositories>

    <build>
        <sourceDirectory>src/main/scala</sourceDirectory>
        <testSourceDirectory>src/test/scala</testSourceDirectory>
        <plugins>

            <plugin>
                <!-- see http://davidb.github.com/scala-maven-plugin -->
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>${scala.maven.plugin}</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                        <configuration>
                            <args>
                                <arg>-dependencyfile</arg>
                                <arg>${project.build.directory}/.scala_dependencies</arg>
                            </args>
                        </configuration>
                    </execution>
                </executions>
            </plugin>

            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>${maven.surefire.plugin}</version>
                <configuration>
                    <useFile>false</useFile>
                    <disableXmlReport>true</disableXmlReport>
                    <!-- If you have classpath issue like NoDefClassError,... -->
                    <!-- useManifestOnlyJar>false</useManifestOnlyJar -->
                    <includes>
                        <include>**/*Test.*</include>
                        <include>**/*Suite.*</include>
                    </includes>
                </configuration>
            </plugin>

            <!-- "package" command plugin -->

            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>2.4.3</version>
                <executions>
                    <!-- Run shade goal on package phase -->
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <transformers>
                                <!-- add Main-Class to manifest file -->
                                <transformer
                                    implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                    <mainClass>Main</mainClass>
                                </transformer>
                                <transformer
                                    implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer" />
                                <transformer
                                    implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                                    <resource>META-INF/services/org.apache.lucene.codecs.Codec</resource>
                                </transformer>
                                <transformer
                                    implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                                    <resource>META-INF/services/org.apache.lucene.codecs.DocValuesFormat</resource>
                                </transformer>
                                <transformer
                                    implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                                    <resource>META-INF/services/org.apache.lucene.codecs.PostingsFormat</resource>
                                </transformer>
                            </transformers>
                            <shadedArtifactAttached>true</shadedArtifactAttached>
                            <shadedClassifierName>jar-with-dependencies</shadedClassifierName>
                            <filters>
                                <filter>
                                    <artifact>*:*</artifact>
                                    <excludes>
                                        <exclude>META-INF/*.SF</exclude>
                                        <exclude>META-INF/*.DSA</exclude>
                                        <exclude>META-INF/*.RSA</exclude>
                                    </excludes>
                                </filter>
                            </filters>
                            <!-- Additional configuration. -->
                        </configuration>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>${maven.compiler.plugin}</version>
                <configuration>
                    <source>${maven.compiler.source}</source>
                    <target>${maven.compiler.target}</target>
                </configuration>
            </plugin>

        </plugins>
    </build>
    <artifactId>dlq_spark</artifactId>
</project>