I am new to Scala and Spark. I want to run this program, but I get compilation errors. Can anyone help me with this?
package main.scala.com.matthewrathbone.spark

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import scala.collection.Map

class ExampleJob(sc: SparkContext) {
  // reads data from text files and computes the results. This is what you test
  def run(t: String, u: String) : RDD[(String, String)] = {
    val transactions = sc.textFile(t)
    val newTransactionsPair = transactions.map{t =>
      val p = t.split(" ")
      (p(2).toInt, p(1).toInt)
    }
    val users = sc.textFile(u)
    val newUsersPair = users.map{t =>
      val p = t.split(" ")
      (p(0).toInt, p(3))
    }
    val result = processData(newTransactionsPair, newUsersPair)
    return sc.parallelize(result.toSeq).map(t => (t._1.toString, t._2.toString))
  }

  def processData (t: RDD[(Int, Int)], u: RDD[(Int, String)]) : Map[Int,Long] = {
    var jn = t.leftOuterJoin(u).values.distinct
    return jn.countByKey
  }
}

object ExampleJob {
  def main(args: Array[String]) {
    val transactionsIn = Resource.fromFile("/home/ali/Desktop/main/scala/com/matthewrathbone/spark/transactions.txt")
    val usersIn = Resource.fromFile("/home/ali/Desktop/main/scala/com/matthewrathbone/spark/users.txt")
    //val transactionsIn = args(1)
    // val usersIn = args(0)
    val conf = new SparkConf().setAppName("SparkJoins").setMaster("local")
    val context = new SparkContext(conf)
    val job = new ExampleJob(context)
    val results = job.run(transactionsIn, usersIn)
    //val output = args(2)
    val output = Resource.fromFile("/home/ali/Desktop/main/scala/com/matthewrathbone/spark/out.txt")
    results.saveAsTextFile(output)
    context.stop()
  }
}
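I tried taking the input from args instead, but the error is the same. In spark-shell this code performs some operations on two text files. Sometimes I also get an error on the package definition in the first line.
Thanks in advance.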
Answer 0 (score: 1)
Your run method takes 2 parameters, both of them Strings: (t: String, u: String). In the main method, however, you call it with 2 Resource values. You want to change transactionsIn and usersIn to Strings that hold the file paths.
I am also new to Scala, but I don't think you should use return in your code; see this SO.
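A minimal sketch of the change described in this answer, not a definitive fix. It condenses the ExampleJob class from the question, drops the explicit return statements, and has main pass plain String paths to run and saveAsTextFile instead of Resource.fromFile(...) values. Reading the paths from args in the order users, transactions, output follows the commented-out lines in the question; compiling it still assumes Spark is on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

class ExampleJob(sc: SparkContext) {
  // Reads both text files and returns (key, count) pairs as strings.
  def run(t: String, u: String): RDD[(String, String)] = {
    // Transactions: keyed by field 2, carrying field 1 as the value.
    val transactions = sc.textFile(t).map { line =>
      val p = line.split(" ")
      (p(2).toInt, p(1).toInt)
    }
    // Users: keyed by field 0, carrying field 3 as the value.
    val users = sc.textFile(u).map { line =>
      val p = line.split(" ")
      (p(0).toInt, p(3))
    }
    val counts = processData(transactions, users)
    // The last expression is the result; no `return` is needed.
    sc.parallelize(counts.toSeq).map { case (k, v) => (k.toString, v.toString) }
  }

  // Joins on the shared key, then counts distinct joined values per transaction value.
  def processData(t: RDD[(Int, Int)], u: RDD[(Int, String)]): scala.collection.Map[Int, Long] =
    t.leftOuterJoin(u).values.distinct().countByKey()
}

object ExampleJob {
  def main(args: Array[String]): Unit = {
    // Plain String paths; the argument order mirrors the commented-out
    // args(0), args(1), args(2) lines in the question.
    val usersIn = args(0)
    val transactionsIn = args(1)
    val output = args(2)

    val conf = new SparkConf().setAppName("SparkJoins").setMaster("local")
    val context = new SparkContext(conf)
    try {
      val results = new ExampleJob(context).run(transactionsIn, usersIn)
      results.saveAsTextFile(output) // saveAsTextFile also expects a String path
    } finally {
      context.stop()
    }
  }
}
```

Note that saveAsTextFile writes a directory of part files at the given path, and by default it fails if that path already exists.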
Answer 1 (score: 0)
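I found my problem. The arguments did indeed not match, but changing them to Strings did not solve it on its own. After that I used sbt to compile and package the program; sbt pulled in the libraries automatically and the program ran fine. Thank you for your answers.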
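For readers in the same situation, a build definition along these lines is what lets sbt fetch the Spark library. This is only a sketch: the thread does not show the actual build file, so the project name and both version numbers below are assumptions, and they should be matched to the Spark installation in use.

```scala
// build.sbt (hypothetical; the name and the versions are assumptions, not taken from the thread)
name := "spark-joins"
version := "0.1.0"
scalaVersion := "2.11.12"

// %% appends the Scala binary version, so this resolves the spark-core_2.11 artifact.
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.4.8"
```

With the source file under src/main/scala, running sbt package compiles the code and builds a jar, and sbt run starts the main method.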