I want to read two columns from a database, group them by the first column, and then insert the result into another table using Spark. My program is written in Java. I tried the following:
public static void aggregateSessionEvents(org.apache.spark.SparkContext sparkContext) {
    com.datastax.spark.connector.japi.rdd.CassandraJavaPairRDD<String, String> logs = javaFunctions(sparkContext)
            .cassandraTable("dove", "event_log", mapColumnTo(String.class), mapColumnTo(String.class))
            .select("session_id", "event");
    logs.groupByKey();
    com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions(logs).writerBuilder("dove", "event_aggregation", null).saveToCassandra();
    sparkContext.stop();
}

This gives me the error:

The method cassandraTable(String, String, RowReaderFactory<T>) in the type SparkContextJavaFunctions is not applicable for the arguments (String, String, RowReaderFactory<String>, mapColumnTo(String.class))

My dependencies are:
How can I fix this?
Answer 0 (score: 1)
Change this:
.cassandraTable("dove", "event_log", mapColumnTo(String.class), mapColumnTo(String.class))
to:
.cassandraTable("dove", "event_log", mapColumnTo(String.class))
You are passing an extra argument.
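Note that with a single mapColumnTo factory the read yields an RDD of one column, while groupByKey needs a key/value pair RDD. Below is a minimal sketch of one way to build that pair RDD from the two columns using the connector's plain CassandraRow API; the table and column names are taken from the question, and groupEventsBySession is just an illustrative name:

import org.apache.spark.SparkContext;
import org.apache.spark.api.java.JavaPairRDD;
import scala.Tuple2;
import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;

// Read the two columns as CassandraRow objects and key the RDD by session_id.
public static JavaPairRDD<String, Iterable<String>> groupEventsBySession(SparkContext sparkContext) {
    JavaPairRDD<String, String> logs = javaFunctions(sparkContext)
            .cassandraTable("dove", "event_log")
            .select("session_id", "event")
            .mapToPair(row -> new Tuple2<>(row.getString("session_id"), row.getString("event")));
    // One entry per session_id, carrying all of its events.
    return logs.groupByKey();
}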
Answer 1 (score: 0)
To group the data by a field, do the following. Afterwards, the grouped data can be inserted into another table or used for further processing:
public static void aggregateSessionEvents(SparkContext sparkContext) {
    // Read the table into an RDD of Data beans (one bean per row).
    JavaRDD<Data> datas = javaFunctions(sparkContext).cassandraTable("test", "data",
            mapRowTo(Data.class));
    // Turn each bean into a (key, value) pair.
    JavaPairRDD<String, String> pairDatas = datas
            .mapToPair(data -> new Tuple2<>(data.getKey(), data.getValue()));
    // reduceByKey returns a new RDD; keep it, since the original pairDatas is not modified.
    JavaPairRDD<String, String> grouped = pairDatas.reduceByKey((value1, value2) -> value1 + "," + value2);
    sparkContext.stop();
}
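The method above stops before the write step. As a rough sketch of how the reduced pairs could be saved to another table, assuming a hypothetical EventAggregation bean whose fields line up with the target table's columns, a hypothetical saveAggregation helper in the same class, and placeholder keyspace/table names:

import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import java.io.Serializable;
import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;
import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapToRow;

// Hypothetical bean; its properties are mapped to the target table's columns.
public class EventAggregation implements Serializable {
    private String key;
    private String values;
    public EventAggregation() {}
    public EventAggregation(String key, String values) {
        this.key = key;
        this.values = values;
    }
    public String getKey() { return key; }
    public void setKey(String key) { this.key = key; }
    public String getValues() { return values; }
    public void setValues(String values) { this.values = values; }
}

// Hypothetical helper: map each reduced (key, value) pair to the bean and write it out.
public static void saveAggregation(JavaPairRDD<String, String> grouped) {
    JavaRDD<EventAggregation> rows = grouped
            .map(tuple -> new EventAggregation(tuple._1(), tuple._2()));
    javaFunctions(rows)
            .writerBuilder("test", "data_aggregation", mapToRow(EventAggregation.class))
            .saveToCassandra();
}

With this sketch, saveAggregation(grouped) would be called before sparkContext.stop().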