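Compilation error in Scala Spark: expected class or object definition

Time: 2018-04-28 19:25:54

Tags: scala apache-spark

I want to run this program. I am new to Scala and Spark, and I get a compilation error. Can anyone help me with this?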


package main.scala.com.matthewrathbone.spark

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import scala.collection.Map

class ExampleJob(sc: SparkContext) {
  // reads data from text files and computes the results. This is what you test
  def run(t: String, u: String) : RDD[(String, String)] = {
    val transactions = sc.textFile(t)
    val newTransactionsPair = transactions.map{t =>
      val p = t.split(" ")
      (p(2).toInt, p(1).toInt)
    }

    val users = sc.textFile(u)
    val newUsersPair = users.map{t =>
      val p = t.split(" ")
      (p(0).toInt, p(3))
    }

    val result = processData(newTransactionsPair, newUsersPair)
    return sc.parallelize(result.toSeq).map(t => (t._1.toString, t._2.toString))
  }

  def processData (t: RDD[(Int, Int)], u: RDD[(Int, String)]) : Map[Int,Long] = {
    var jn = t.leftOuterJoin(u).values.distinct
    return jn.countByKey
  }
}

object ExampleJob {
  def main(args: Array[String]) {
    val transactionsIn = Resource.fromFile("/home/ali/Desktop/main/scala/com/matthewrathbone/spark/transactions.txt")
    val usersIn = Resource.fromFile("/home/ali/Desktop/main/scala/com/matthewrathbone/spark/users.txt")

    //val transactionsIn = args(1)
    //val usersIn = args(0)
    val conf = new SparkConf().setAppName("SparkJoins").setMaster("local")
    val context = new SparkContext(conf)
    val job = new ExampleJob(context)
    val results = job.run(transactionsIn, usersIn)
    //val output = args(2)
    val output = Resource.fromFile("/home/ali/Desktop/main/scala/com/matthewrathbone/spark/out.txt")
    results.saveAsTextFile(output)
    context.stop()
  }
}

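I tried taking the input from args, but the error is the same. This code performs some operations on two text files in spark-shell. I also sometimes get an error on the package definition in the first line.

Thanks in advance.

2 answers:

Answer 0: (score: 1)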

Your run method takes 2 parameters, both strings (t: String, u: String), but in the main method you call it with 2 Resource values. You want to change transactionsIn and usersIn to String, like this:

val transactionsIn = "/home/ali/Desktop/main/scala/com/matthewrathbone/spark/transactions.txt"
val usersIn = "/home/ali/Desktop/main/scala/com/matthewrathbone/spark/users.txt"

I am also new to Scala, but I think you should not use return in your code; see this SO post.
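
As a rough sketch of how the job could look with plain String paths and no explicit return, assuming Spark 2.x is on the classpath. The paths are the ones from the question; everything else mirrors the question's code and is not a confirmed fix:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import scala.collection.Map

class ExampleJob(sc: SparkContext) {
  // Reads both text files and returns the per-key counts as string pairs.
  def run(t: String, u: String): RDD[(String, String)] = {
    val transactions = sc.textFile(t).map { line =>
      val p = line.split(" ")
      (p(2).toInt, p(1).toInt)
    }
    val users = sc.textFile(u).map { line =>
      val p = line.split(" ")
      (p(0).toInt, p(3))
    }
    val result = processData(transactions, users)
    // The last expression is the method's result, so no `return` is needed.
    sc.parallelize(result.toSeq).map { case (k, v) => (k.toString, v.toString) }
  }

  // Left-joins on the key, drops the join key, and counts the distinct
  // remaining pairs per new key.
  def processData(t: RDD[(Int, Int)], u: RDD[(Int, String)]): Map[Int, Long] =
    t.leftOuterJoin(u).values.distinct().countByKey()
}

object ExampleJob {
  def main(args: Array[String]): Unit = {
    // Plain String paths, matching the (t: String, u: String) signature of run.
    val transactionsIn = "/home/ali/Desktop/main/scala/com/matthewrathbone/spark/transactions.txt"
    val usersIn = "/home/ali/Desktop/main/scala/com/matthewrathbone/spark/users.txt"
    // saveAsTextFile creates a directory at this path and fails if it already exists.
    val output = "/home/ali/Desktop/main/scala/com/matthewrathbone/spark/out.txt"

    val conf = new SparkConf().setAppName("SparkJoins").setMaster("local")
    val context = new SparkContext(conf)
    try {
      new ExampleJob(context).run(transactionsIn, usersIn).saveAsTextFile(output)
    } finally {
      context.stop()
    }
  }
}
```

Keeping processData separate from run means the join logic can be exercised with small in-memory RDDs built by sc.parallelize, without touching any files.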

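Answer 1: (score: 0)

I found my problem. The parameters did not match, of course, but changing them to strings did not solve it. After that I used sbt to package and compile; sbt added the libraries automatically and the program ran fine. Thanks for your answers.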
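
For readers who want to reproduce the sbt setup, a minimal build.sbt might look like the sketch below. The project name and the Scala and Spark versions are assumptions (picked to be contemporary with this post), not details given in the thread:

```scala
// build.sbt: minimal sketch; the name and all version numbers are assumptions
name := "spark-joins"
version := "0.1.0"
scalaVersion := "2.11.12"

// spark-core provides SparkContext, SparkConf, and RDD;
// %% appends the Scala binary version to the artifact name
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.0"
```

With this file in the project root and the source file under src/main/scala, running sbt package compiles the code against the Spark library and builds a jar.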