Spark Streaming 1.6.1 does not work with Kinesis ASL 1.6.1 or ASL 2.0.0-preview

Date: 2016-07-04 14:47:02

Tags: spark-streaming amazon-emr amazon-kinesis

I am trying to run a Spark Streaming job on EMR that reads from Kinesis, using Spark 1.6.1 with Kinesis ASL 1.6.1. The job is a simple word-count example.

The dependencies are:

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kinesis-asl_2.10</artifactId>
        <version>1.6.1</version>
    </dependency>
    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>amazon-kinesis-client</artifactId>
        <version>1.6.3</version>
    </dependency>
    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>amazon-kinesis-producer</artifactId>
        <version>0.10.2</version>
    </dependency>

It throws this exception:

java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: com/google/protobuf/ProtocolStringList
    at com.amazonaws.services.kinesis.clientlibrary.lib.worker.ShardConsumer.checkAndSubmitNextTask(ShardConsumer.java:157)
    at com.amazonaws.services.kinesis.clientlibrary.lib.worker.ShardConsumer.consumeShard(ShardConsumer.java:126)

I then upgraded to 2.0.0-preview:

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kinesis-asl_2.10</artifactId>
        <version>2.0.0-preview</version>
    </dependency>

which fails with the following exception:

at org.apache.spark.streaming.kinesis.KinesisUtils$$anonfun$createStream$1.apply(KinesisUtils.scala:74)

1 answer:

Answer 0 (score: 1)

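This is caused by a protobuf-java dependency conflict. Run mvn dependency:tree to find the version of protobuf-java that KCL and KPL depend on. Then look in the Spark lib directory, where you will find a different version. Use the maven-shade-plugin to relocate the conflicting classes: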

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>2.3</version>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <outputFile>
                    ${project.build.directory}/${project.artifactId}-${project.version}-selfcontained.jar
                </outputFile>
                <relocations>
                    <relocation>
                        <pattern>com.google.protobuf</pattern>
                        <shadedPattern>shade.com.google.protobuf</shadedPattern>
                    </relocation>
                    <relocation>
                        <pattern>com.amazonaws</pattern>
                        <shadedPattern>shade.com.amazonaws</shadedPattern>
                    </relocation>
                </relocations>
                <filters>
                    <filter>
                        <artifact>*:*</artifact>
                        <excludes>
                            <exclude>META-INF/*.SF</exclude>
                            <exclude>META-INF/*.DSA</exclude>
                            <exclude>META-INF/*.RSA</exclude>
                        </excludes>
                    </filter>
                </filters>
                <transformers>
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer" />
                </transformers>
            </configuration>
        </execution>
    </executions>
</plugin>
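
To confirm which jar actually supplies a protobuf class at runtime, a small check like the one below can help. This is only a sketch: `ClassLocator` and `locate` are hypothetical names made up for illustration, not part of Spark, KCL, or KPL, and the result is only meaningful when the check runs on the same classpath as the failing job (for example, called from the driver or from inside a task). As far as I know, `ProtocolStringList` first appeared in protobuf-java 2.6, so "not found" would suggest an older protobuf jar is winning.

```java
import java.security.CodeSource;

// Hypothetical helper (not from Spark/KCL/KPL): reports where a class is loaded from.
public class ClassLocator {

    /** Returns the jar/directory URL of the class, or a marker if it is unavailable. */
    static String locate(String className) {
        try {
            // initialize=false: resolve the class without running its static initializers
            Class<?> cls = Class.forName(className, false, ClassLocator.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            // Classes from the bootstrap loader usually have no code source
            return source == null ? "bootstrap/JDK" : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "not found";
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the NoClassDefFoundError above
        String name = args.length > 0 ? args[0] : "com.google.protobuf.ProtocolStringList";
        System.out.println(name + " -> " + locate(name));
    }
}
```

After shading with the relocation above, the classes your jar bundles live under `shade.com.google.protobuf`, so passing `shade.com.google.protobuf.ProtocolStringList` as the argument should print the path of the self-contained jar.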