I've started learning a bit of Spark / Scala / GraphX together with Pregel, and I found some simple sample code here: http://www.cakesolutions.net/teamblogs/graphx-pregel-api-an-example and wanted to try it out. So I tried to compile the code as I understood it (this is my first time using Scala):
/* SimpleApp.scala */
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import org.apache.spark._
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD

class Graph[VD, ED] (
    val vertices: VertexRDD[VD],
    val edges: EdgeRDD[ED]) {
}

object SimpleApp {
  val initialMsg = 9999

  def vprog(vertexId: VertexId, value: (Int, Int), message: Int): (Int, Int) = {
    if (message == initialMsg)
      value
    else
      (message min value._1, value._1)
  }

  def sendMsg(triplet: EdgeTriplet[(Int, Int), Boolean]): Iterator[(VertexId, Int)] = {
    val sourceVertex = triplet.srcAttr
    if (sourceVertex._1 == sourceVertex._2)
      Iterator.empty
    else
      Iterator((triplet.dstId, sourceVertex._1))
  }

  def mergeMsg(msg1: Int, msg2: Int): Int = msg1 min msg2

  def pregel[A]
      (initialMsg: A,
       maxIter: Int = Int.MaxValue,
       activeDir: EdgeDirection = EdgeDirection.Out)
      (vprog: (VertexId, VD, A) => VD,
       sendMsg: EdgeTriplet[VD, ED] => Iterator[(VertexId, A)],
       mergeMsg: (A, A) => A)
    : Graph[VD, ED]

  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    val minGraph = graph.pregel(initialMsg,
      Int.MaxValue,
      EdgeDirection.Out)(
      vprog,
      sendMsg,
      mergeMsg)
    minGraph.vertices.collect.foreach {
      case (vertexId, (value, original_value)) => println(value)
    }
    sc.stop()
  }
}
But I get this error:
$ C:\"Program Files (x86)"\sbt\bin\sbt package
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
[info] Set current project to Simple Project (in build file:/C:/spark/simple/)
[info] Compiling 1 Scala source to C:\spark\simple\target\scala-2.10\classes...
[error] C:\spark\simple\src\main\scala\SimpleApp.scala:42: not found: type VD
[error] : Graph[VD, ED]
[error] ^
[error] C:\spark\simple\src\main\scala\SimpleApp.scala:42: not found: type ED
[error] : Graph[VD, ED]
[error] ^
[error] C:\spark\simple\src\main\scala\SimpleApp.scala:39: not found: type VD
[error] (vprog: (VertexId, VD, A) => VD,
[error] ^
[error] C:\spark\simple\src\main\scala\SimpleApp.scala:39: not found: type VD
[error] (vprog: (VertexId, VD, A) => VD,
[error] ^
[error] C:\spark\simple\src\main\scala\SimpleApp.scala:40: not found: type VD
[error] sendMsg: EdgeTriplet[VD, ED] => Iterator[(VertexId, A)],
[error] ^
[error] C:\spark\simple\src\main\scala\SimpleApp.scala:40: not found: type ED
[error] sendMsg: EdgeTriplet[VD, ED] => Iterator[(VertexId, A)],
[error] ^
[error] C:\spark\simple\src\main\scala\SimpleApp.scala:35: only classes can have declared but undefined members
[error] def pregel[A]
[error] ^
[error] C:\spark\simple\src\main\scala\SimpleApp.scala:48: not found: value graph
[error] val minGraph = graph.pregel(initialMsg,
[error] ^
[error] 8 errors found
[error] (compile:compileIncremental) Compilation failed
[error] Total time: 5 s, completed Jan 17, 2017 12:35:00 AM
I'm fairly new to Scala, so I don't fully understand the role of VD / ED in these definitions.
Answer 0 (score: 2)
The problem is that you cannot use a type in a method definition unless it has been declared as a type parameter somewhere in scope (for example, on the method itself, or on an enclosing class).

Look at your method def pregel[A]. It returns a value of type Graph[VD, ED]. But how is the compiler supposed to know what VD refers to? If that method really is what you intend your code to do, the immediate fix is simply to add VD (and likewise ED) as method type parameters:

def pregel[A, VD, ED]
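The scoping rule can be seen with a tiny standalone example (the names here are hypothetical, not from your code): every type mentioned in a method's signature must be introduced either on the method itself or on an enclosing class.

```scala
object ScopeDemo {
  // Compiles: A is declared as a type parameter of the method.
  def first[A](xs: List[A]): A = xs.head

  // Would NOT compile: B is never declared anywhere in scope.
  // def broken(xs: List[B]): B = xs.head   // error: not found: type B
}
```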
Note that if pregel were a method of the class Graph, it would compile as-is, because Graph declares those types: class Graph[VD, ED]. As posted, though, your methods seem to be dangling in the middle of nowhere, which is not allowed. Perhaps you might want to consider moving them into the Graph class? (That would also explain the error "only classes can have declared but undefined members": your pregel has a signature but no body, and a body-less member is only legal as an abstract member of a class or trait.)
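In fact, you do not need to declare pregel yourself at all: GraphX's own Graph class already provides it, with VD and ED declared for you. Below is a minimal sketch of the corrected program, assuming Spark and GraphX are on the classpath; the custom Graph class and the abstract pregel declaration are dropped, and the small example graph (its vertices and edges) is an assumption added here for illustration, with the vertex attribute being a (currentMin, originalMin) pair:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD

object SimpleApp {
  val initialMsg = 9999

  def vprog(vertexId: VertexId, value: (Int, Int), message: Int): (Int, Int) =
    if (message == initialMsg) value
    else (message min value._1, value._1)

  def sendMsg(triplet: EdgeTriplet[(Int, Int), Boolean]): Iterator[(VertexId, Int)] = {
    val sourceVertex = triplet.srcAttr
    if (sourceVertex._1 == sourceVertex._2) Iterator.empty
    else Iterator((triplet.dstId, sourceVertex._1))
  }

  def mergeMsg(msg1: Int, msg2: Int): Int = msg1 min msg2

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("Simple Application").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Hypothetical example data: each vertex starts with (value, value).
    val vertices: RDD[(VertexId, (Int, Int))] =
      sc.parallelize(Seq((1L, (7, 7)), (2L, (3, 3)), (3L, (2, 2)), (4L, (6, 6))))
    val edges: RDD[Edge[Boolean]] =
      sc.parallelize(Seq(Edge(1L, 2L, true), Edge(2L, 3L, true), Edge(3L, 4L, true)))
    val graph = Graph(vertices, edges)

    // GraphX's built-in Graph.pregel; no custom declaration needed.
    val minGraph = graph.pregel(initialMsg, Int.MaxValue, EdgeDirection.Out)(
      vprog, sendMsg, mergeMsg)

    minGraph.vertices.collect.foreach {
      case (vertexId, (value, originalValue)) => println(value)
    }
    sc.stop()
  }
}
```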