Maven Shade plugin and DataNucleus problem

Date: 2017-04-02 18:53:37

Tags: spring maven hive datanucleus maven-shade-plugin

I'm trying to run, outside of my Eclipse IDE, code that works fine inside it, and I'm facing some weird errors I can't get past. To summarize my problem:

  • Running my code from Eclipse: everything works.
  • Capturing the command line Eclipse uses to launch my application and pasting it into a shell: everything works.

Now, the command line Eclipse generates to run my application looks something like java -cp lots-of-jars -Dvm.params myPackage.MyMainClass app-params

My goal is to run my application as an Oozie Java action, so I need to build an uber jar that collapses lots-of-jars down to myapp.jar
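For context, a minimal sketch of what such an Oozie Java action could look like (the action name, transitions and lib/ note are illustrative, not from my actual workflow; the main class matches the one configured in the shade plugin below):

<action name="run-myapp">
    <java>
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <main-class>es.mycompany.bigdata.OozieAction</main-class>
        <arg>json2hive</arg>
        <!-- myapp.jar is placed in the workflow's lib/ directory -->
    </java>
    <ok to="end"/>
    <error to="fail"/>
</action>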

To do that, I configured the maven-shade-plugin as follows:

         <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>2.4.2</version>
            <configuration>
            </configuration>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>
                    <configuration>
                        <filters>
                            <filter>
                                <artifact>*:*</artifact>
                                <excludes>
                                    <exclude>META-INF/*.SF</exclude>
                                    <exclude>META-INF/*.DSA</exclude>
                                    <exclude>META-INF/*.RSA</exclude>
                                </excludes>
                            </filter>
                        </filters>
                        <transformers>
                            <transformer
                                implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                                <resource>reference.conf</resource>
                            </transformer>
                            <transformer
                                implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                <mainClass>es.mycompany.bigdata.OozieAction</mainClass>
                            </transformer>
                            <transformer
                                implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer" />
                            <transformer
                                implementation="org.apache.maven.plugins.shade.resource.PluginXmlResourceTransformer" />
                        </transformers>
                    </configuration>
                </execution>
            </executions>
        </plugin>

I had to add some of those transformers because of errors I was facing when launching my application (failure to create the FsShell Spring object, failure to start the SparkContext, ...). By the way, the purpose of my app is to download some Azure blobs, put them into HDFS, transform them with Spark, and finally load them into a Hive table. I developed the application in Java (including the Spark part), using Spring.
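To illustrate why the ServicesResourceTransformer is needed: libraries like Hadoop register their implementations in META-INF/services files, and when several jars ship a file with the same name, shading without that transformer keeps only one of them instead of concatenating them. An illustrative Hadoop service file:

# META-INF/services/org.apache.hadoop.fs.FileSystem (illustrative contents)
org.apache.hadoop.hdfs.DistributedFileSystem
org.apache.hadoop.fs.LocalFileSystem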

Now, my latest problem appears when I try to create a HiveContext (my SparkContext is fine, because if I omit the Hive part my app works):

@Bean
@Lazy
@Scope("singleton")
public SQLContext getSQLContext(@Autowired JavaSparkContext sparkContext) {
    return new HiveContext(sparkContext);
}
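For reference, a commented sketch of the same bean (the commented-out metastore URI is a placeholder, not from my setup; HiveContext normally reads hive-site.xml from the classpath, which is what happens in the Eclipse run because /etc/hive/conf.dist is on it):

import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.hive.HiveContext;

@Bean
@Lazy
@Scope("singleton")
public SQLContext getSQLContext(@Autowired JavaSparkContext sparkContext) {
    // HiveContext extends SQLContext with Hive metastore support and picks
    // up hive-site.xml from the classpath if present.
    HiveContext context = new HiveContext(sparkContext);
    // Placeholder, only needed if hive-site.xml is not on the classpath:
    // context.setConf("hive.metastore.uris", "thrift://metastore-host:9083");
    return context;
}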

The error thrown is:

2017-04-02 20:20:18 WARN  Persistence:106 - Error creating validator of type org.datanucleus.properties.CorePropertyValidator
ClassLoaderResolver for class "" gave error on creation : {1}
org.datanucleus.exceptions.NucleusUserException: ClassLoaderResolver for class "" gave error on creation : {1}
...
Caused by: org.datanucleus.exceptions.NucleusUserException: Persistence process has been specified to use a ClassLoaderResolver of name "datanucleus" yet this has not been found by the DataNucleus plugin mechanism. Please check your CLASSPATH and plugin specification.
        at org.datanucleus.NucleusContext.<init>(NucleusContext.java:283)
        at org.datanucleus.NucleusContext.<init>(NucleusContext.java:247)
        at org.datanucleus.NucleusContext.<init>(NucleusContext.java:225)
        at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.<init>(JDOPersistenceManagerFactory.java:416)
        at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:301)
        at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202)
        ... 93 more
2017-04-02 20:20:18 WARN  ExtendedAnnotationApplicationContext:550 - Exception encountered during context initialization - cancelling refresh attempt: org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'getOozieJavaAction': Unsatisfied dependency expressed through field 'sqlContext'; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'getSQLContext' defined in es.mediaset.technology.bigdata.config.FlatJsonToCsvAppConfig: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [org.apache.spark.sql.SQLContext]: Factory method 'getSQLContext' threw exception; nested exception is java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient

Given that my code runs correctly in Eclipse, and that Eclipse launches it with a command like this:

/usr/java/jdk1.8.0_121/bin/java -Dmode=responsive -Dspark.master=local[*] -Dfile.encoding=UTF-8 -classpath /home/cloudera/workspace-sts/oozie-eventhub-retriever/target/classes:/home/cloudera/workspace-sts/java-framework/target/classes:/home/cloudera/.m2/repository/com/microsoft/azure/azure-storage/5.0.0/azure-storage-5.0.0.jar:<...>:/etc/hive/conf.dist es.mycompany.technology.bigdata.OozieAction json2hive

I think my shade configuration is wrong, but I can't figure out why; I don't see what I'm doing wrong...

Thanks

1 Answer:

Answer 0 (score: 0):

The following Stack Overflow Q&A answers this question: see here

For those wondering how to "merge" all the plugin.xml files from DataNucleus: you can take the file from Apache spark Hive, executable JAR with maven shade and paste it into your resources folder.
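The background: DataNucleus discovers its own components through an Eclipse-style plugin registry, and each of datanucleus-core, datanucleus-api-jdo and datanucleus-rdbms ships its own plugin.xml. Shading keeps only one of them, which is why registrations such as the ClassLoaderResolver named "datanucleus" go missing. A sketch of what the hand-merged file looks like (illustrative skeleton only; the real file must contain the union of all extension points and extensions from the three jars):

<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative skeleton of a hand-merged DataNucleus plugin.xml -->
<plugin id="org.datanucleus" name="DataNucleus Core" provider-name="DataNucleus">
    <extension-point id="classloader_resolver" name="ClassLoader Resolver"
                     schema="schema/classloader_resolver.exsd"/>
    <extension point="org.datanucleus.classloader_resolver">
        <!-- the registration the "ClassLoaderResolver ... of name 'datanucleus'
             ... has not been found" error complains about -->
        <class-loader-resolver name="datanucleus"
                               class-name="org.datanucleus.ClassLoaderResolverImpl"/>
    </extension>
    <!-- ... all remaining extension-point and extension elements ... -->
</plugin>

Placed in src/main/resources, the merged file takes precedence over the per-jar copies when the uber jar is built.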