Unable to instantiate org.apache.spark.api.java.JavaSparkContext

Date: 2018-06-05 23:24:54

Tags: java spring apache-spark spring-boot apache-spark-sql

I am working on an Apache Spark integration using Spring Boot and Cassandra DB. I use the configuration class below to set up Spark and the Cassandra connection:



    @Configuration
    @PropertySource("classpath:application.properties")
    public class SparkConfig {

        private static Logger log = LoggerFactory.getLogger(DashboardRestServicesApplication.class.getName());

        @Value("${spark.master}")
        private String sparkMaster;

        @Value("${spring.data.cassandra.keyspace-name}")
        private String cassandraKeyspace;

        @Value("${cassandra.table}")
        private String cassandraTable;

        @Value("${spring.data.cassandra.contact-points}")
        private String cassandraHost;

        @Value("${spring.data.cassandra.port}")
        private String cassandraPort;

        @Value("${spring.data.cassandra.username}")
        private String username;

        @Value("${spring.data.cassandra.password}")
        private String password;

        @Bean
        public SparkConf sparkConf() {
            SparkConf conf = new SparkConf(true)
                    .set("spark.cassandra.connection.host", cassandraHost)
                    .set("spark.cassandra.connection.port", cassandraPort)
                    .set("spark.cassandra.auth.username", username)
                    .set("spark.cassandra.auth.password", password)
                    .set("spark.submit.deployMode", "client")
                    .setMaster(sparkMaster)
                    .setAppName("DashboardSparkService");
            return conf;
        }

        @Bean
        public JavaSparkContext javaSparkContext() {
            log.info("Connecting to spark with master Url: {}, and cassandra host: {}",
                    sparkMaster, cassandraHost);

            JavaSparkContext javaSparkContext = new JavaSparkContext(sparkConf());

            log.debug("spark context created");

            return javaSparkContext;
        }

        @Bean
        public SparkSession sparkSession() {
            return SparkSession
                    .builder()
                    .sparkContext(javaSparkContext().sc())
                    .appName("DashboardSparkService")
                    .getOrCreate();
        }
    }
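
For reference, the @Value placeholders above imply an application.properties along these lines. The keys are taken from the class itself; the values here are only illustrative, not the original file:

    # illustrative values only -- the real file is not shown in the question
    spark.master=local[*]
    cassandra.table=task_summary
    spring.data.cassandra.keyspace-name=dashboard
    spring.data.cassandra.contact-points=127.0.0.1
    spring.data.cassandra.port=9042
    spring.data.cassandra.username=cassandra
    spring.data.cassandra.password=cassandra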

Here is my service:

    @Autowired
    private JavaSparkContext javaSparkContext;

    @Autowired
    private TaskSummarySparkRepo taskSummarySparkRepo;

    public int getAllOrders() {
        //JavaSparkContext javaSparkContext = sparkConfig.javaSparkContext();
        return taskSummarySparkRepo.getAllOrders(javaSparkContext);
    }
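
The snippet above presumably comes from the class the error below calls 'taskSummarySparkService'. A minimal sketch of how such a service might be driven once the context starts, assuming the enclosing class is named TaskSummarySparkService (OrderCountRunner is a hypothetical class, not part of the project):

    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.boot.CommandLineRunner;
    import org.springframework.stereotype.Component;

    // Hypothetical driver: invokes the service once the application context is up.
    @Component
    public class OrderCountRunner implements CommandLineRunner {

        @Autowired
        private TaskSummarySparkService taskSummarySparkService;

        @Override
        public void run(String... args) {
            System.out.println("Total orders: " + taskSummarySparkService.getAllOrders());
        }
    }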

And here is my repo:

    // Static imports needed for javaFunctions(...) and mapRowTo(...):
    import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;
    import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapRowTo;

    @Service
    public class TaskSummarySparkRepo {

        @Value("${spring.data.cassandra.keyspace-name}")
        private String cassandraKeyspace;

        @Value("${cassandra.table}")
        private String cassandraTable;

        public int getAllOrders(JavaSparkContext javaSparkContext) {
            JavaRDD<TaskSummary> rdd = javaFunctions(javaSparkContext)
                    .cassandraTable(cassandraKeyspace, cassandraTable,
                            mapRowTo(TaskSummary.class));

            return (int) rdd.count();
        }
    }
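
mapRowTo(TaskSummary.class) maps Cassandra rows onto a JavaBean by matching column names to bean properties, so TaskSummary needs a no-arg constructor and setters. A sketch with made-up fields, since the real entity is not shown in the question:

    import java.io.Serializable;

    // Hypothetical entity: property names must correspond to the Cassandra column
    // names (e.g. a task_id column maps to the taskId property).
    public class TaskSummary implements Serializable {

        private String taskId;
        private String status;

        public TaskSummary() { }  // no-arg constructor required by the row mapper

        public String getTaskId() { return taskId; }
        public void setTaskId(String taskId) { this.taskId = taskId; }

        public String getStatus() { return status; }
        public void setStatus(String status) { this.status = status; }
    }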

Everything is fine as far as the build succeeds, but when I try to run the jar, it throws the following error:

    org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'taskSummarySparkService': Unsatisfied dependency expressed through field 'javaSparkContext'; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'javaSparkContext' defined in class path resource [com/spectrum/dashboard/config/SparkConfig.class]: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [org.apache.spark.api.java.JavaSparkContext]: Factory method 'javaSparkContext' threw exception; nested exception is java.lang.NoClassDefFoundError: org/apache/spark/network/util/ByteUnit

When I try to run it as a Spring Boot application from STS, it throws the following error:

    Exception encountered during context initialization - cancelling refresh attempt: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'javaSparkContext' defined in class path resource [com/spectrum/dashboard/config/SparkConfig.class]: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [org.apache.spark.api.java.JavaSparkContext]: Factory method 'javaSparkContext' threw exception; nested exception is java.lang.NoClassDefFoundError: org/spark_project/guava/collect/MapMaker        

EDIT: Here are the pom file dependencies I used for the project:

Is there any problem with the configuration, or is it because something is wrong with the dependency injection? Any help is appreciated.
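
One thing worth noting: a NoClassDefFoundError for a Spark-internal class such as org/apache/spark/network/util/ByteUnit usually points to mixed-version or corrupted Spark jars on the runtime classpath rather than the Spring wiring itself. A small stand-alone check (a hypothetical helper, not part of the project) can show which jar, if any, those classes resolve from:

    import java.net.URL;
    import java.security.CodeSource;

    // Hypothetical diagnostic: reports where the classes named in the errors load from.
    public class SparkClasspathCheck {

        public static void main(String[] args) {
            report("org.apache.spark.network.util.ByteUnit");
            report("org.spark_project.guava.collect.MapMaker");
        }

        private static void report(String className) {
            try {
                Class<?> cls = Class.forName(className);
                CodeSource source = cls.getProtectionDomain().getCodeSource();
                URL location = (source != null) ? source.getLocation() : null;
                System.out.println(className + " loaded from " + location);
            } catch (ClassNotFoundException e) {
                // Missing class: the jar that should contain it is absent or the wrong version.
                System.out.println(className + " NOT found on the classpath");
            }
        }
    }

If either class fails to resolve, the corresponding jar in the local Maven repository is suspect, which is consistent with the accepted fix below.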

1 Answer:

Answer 0 (score: 0)

The issue was resolved when I deleted the existing local Maven repository, ran maven install again, and restarted the machine.