I am integrating Apache Spark with Spring Boot and Cassandra DB. Spark and the Cassandra connection are configured through application.properties.
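The property keys referenced by the configuration class below come from entries like these (host, keyspace, table, and credential values shown here are placeholders, not my real ones):

    spark.master=spark://127.0.0.1:7077
    spring.data.cassandra.keyspace-name=dashboard_keyspace
    cassandra.table=task_summary
    spring.data.cassandra.contact-points=127.0.0.1
    spring.data.cassandra.port=9042
    spring.data.cassandra.username=cassandra
    spring.data.cassandra.password=cassandra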
Here is my configuration class:
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.SparkSession;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.PropertySource;

@Configuration
@PropertySource("classpath:application.properties")
public class SparkConfig {

    private static final Logger log = LoggerFactory.getLogger(SparkConfig.class);

    @Value("${spark.master}")
    private String sparkMaster;

    @Value("${spring.data.cassandra.keyspace-name}")
    private String cassandraKeyspace;

    @Value("${cassandra.table}")
    private String cassandraTable;

    @Value("${spring.data.cassandra.contact-points}")
    private String cassandraHost;

    @Value("${spring.data.cassandra.port}")
    private String cassandraPort;

    @Value("${spring.data.cassandra.username}")
    private String username;

    @Value("${spring.data.cassandra.password}")
    private String password;

    @Bean
    public SparkConf sparkConf() {
        return new SparkConf(true)
                .set("spark.cassandra.connection.host", cassandraHost)
                .set("spark.cassandra.connection.port", cassandraPort)
                .set("spark.cassandra.auth.username", username)
                .set("spark.cassandra.auth.password", password)
                .set("spark.submit.deployMode", "client")
                .setMaster(sparkMaster)
                .setAppName("DashboardSparkService");
    }

    @Bean
    public JavaSparkContext javaSparkContext() {
        log.info("Connecting to spark with master Url: {}, and cassandra host: {}",
                sparkMaster, cassandraHost);
        JavaSparkContext javaSparkContext = new JavaSparkContext(sparkConf());
        log.debug("spark context created");
        return javaSparkContext;
    }

    @Bean
    public SparkSession sparkSession() {
        // Reuse the JavaSparkContext's underlying SparkContext so only one context exists.
        return SparkSession
                .builder()
                .sparkContext(javaSparkContext().sc())
                .appName("DashboardSparkService")
                .getOrCreate();
    }
}
This is the service that calls into my repo:
@Service
public class TaskSummarySparkService {

    @Autowired
    private JavaSparkContext javaSparkContext;

    @Autowired
    private TaskSummarySparkRepo taskSummarySparkRepo;

    public int getAllOrders() {
        // Earlier I created the context by hand; now it is injected instead:
        // JavaSparkContext javaSparkContext = sparkConfig.javaSparkContext();
        return taskSummarySparkRepo.getAllOrders(javaSparkContext);
    }
}
Everything is fine as far as the build goes; it compiles and packages successfully. Here is the repo class itself:
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Service;

import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;
import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapRowTo;

@Service
public class TaskSummarySparkRepo {

    @Value("${spring.data.cassandra.keyspace-name}")
    private String cassandraKeyspace;

    @Value("${cassandra.table}")
    private String cassandraTable;

    public int getAllOrders(JavaSparkContext javaSparkContext) {
        // Scan the Cassandra table and map each row onto a TaskSummary bean.
        JavaRDD<TaskSummary> rdd = javaFunctions(javaSparkContext)
                .cassandraTable(cassandraKeyspace, cassandraTable,
                        mapRowTo(TaskSummary.class));
        return (int) rdd.count();
    }
}
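mapRowTo maps rows onto TaskSummary through its JavaBean properties, so the class needs a no-argument constructor and setters that line up with the column names. A minimal sketch of such a class (the fields here are illustrative; the real class mirrors the actual table):

    import java.io.Serializable;

    public class TaskSummary implements Serializable {

        private String taskId;   // illustrative field; one per mapped column
        private String status;

        public TaskSummary() { } // no-arg constructor required by the connector's bean mapping

        public String getTaskId() { return taskId; }
        public void setTaskId(String taskId) { this.taskId = taskId; }

        public String getStatus() { return status; }
        public void setStatus(String status) { this.status = status; }
    }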
When I try to run it from STS as a Spring Boot application, it gives the following error:

org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'taskSummarySparkService': Unsatisfied dependency expressed through field 'javaSparkContext'; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'javaSparkContext' defined in class path resource [com/spectrum/dashboard/config/SparkConfig.class]: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [org.apache.spark.api.java.JavaSparkContext]: Factory method 'javaSparkContext' threw exception; nested exception is java.lang.NoClassDefFoundError: org/apache/spark/network/util/ByteUnit

When I run the packaged jar instead, it fails at the same bean, this time on a different missing class:

Exception encountered during context initialization - cancelling refresh attempt: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'javaSparkContext' defined in class path resource [com/spectrum/dashboard/config/SparkConfig.class]: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [org.apache.spark.api.java.JavaSparkContext]: Factory method 'javaSparkContext' threw exception; nested exception is java.lang.NoClassDefFoundError: org/spark_project/guava/collect/MapMaker

EDIT: in the pom I depend on spark-core, spark-sql, and the DataStax spark-cassandra-connector; a sketch follows.
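A trimmed-down sketch of those dependencies (version numbers here are illustrative placeholders; every Spark artifact is kept on one Spark version and one Scala suffix):

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.3.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>2.3.1</version>
    </dependency>
    <dependency>
        <groupId>com.datastax.spark</groupId>
        <artifactId>spark-cassandra-connector_2.11</artifactId>
        <version>2.3.1</version>
    </dependency>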
Is there something wrong with the configuration, or is this a dependency injection problem? Any help is appreciated.
Answer 0 (score: 0):
The problem was resolved when I deleted my existing local Maven repository, ran mvn install again, and restarted the machine. The jars cached in the local repository were apparently corrupted or incomplete, which would explain why classes went missing at runtime even though the build succeeded.
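For reference, the equivalent command-line steps (assuming the default ~/.m2 repository location):

    # delete the cached artifacts so Maven re-downloads everything on the next build
    rm -rf ~/.m2/repository

    # alternatively, let Maven purge and re-resolve only this project's dependencies
    mvn dependency:purge-local-repository

    # then rebuild
    mvn clean install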