How to get the vertex entity from aggregateMessages rather than just the vertexId

Time: 2016-11-11 02:01:54

Tags: apache-spark spark-graphx

package com.mypackage

import org.apache.spark.graphx._
import org.apache.spark.{SparkContext, SparkConf}

/**
  * Created by sidazhang on 11/8/16.
  */
case class Person(age: Int)

case class EdgeImpl()

object GraphApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("SparkMain").setMaster("local[1]")
    val sc = new SparkContext(conf)

    val vertices =
      sc.parallelize(Array((1L, Person(10)), (2L, Person(15)),
        (3L, Person(20)), (4L, Person(30))))

    // Create an RDD for edges
    val relationships =
      sc.parallelize(Array(Edge(2L, 1L, EdgeImpl()),
        Edge(3L, 1L, EdgeImpl()), Edge(4L, 1L, EdgeImpl())))
    val graph = Graph(vertices, relationships)

    // Collect, for each vertex, the attributes of its followers (the source vertices of its incoming edges)
    val olderFollowers: VertexRDD[Array[Person]] = graph.aggregateMessages[Array[Person]](
      ctx => ctx.sendToDst(Array(ctx.srcAttr)),
      // Merge the array of followers
      (a, b) => a ++ b
    )

    // Here I only have the id of the person and a list of his followers.
    // How do I get the vertex of the person
    olderFollowers.collect.foreach { case (id, followers) => followers.foreach(f => println((id, f))) }
  }
}

The problem is that with the aggregateMessages API I end up with only the vertexId. How do I get the actual vertex?

(The question is inline, in the code comments.)

1 answer:

Answer 0 (score: 0)

You have to join it back with the original vertex data:

graph.joinVertices(olderFollowers)(someMergingFunction).vertices
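
To make the join concrete, here is a minimal sketch rather than a definitive implementation. One thing to be aware of: `joinVertices` requires the merging function to return the graph's existing vertex type (`Person` here), so it cannot by itself produce a (person, followers) pair. The sketch therefore uses `VertexRDD.innerJoin` on `graph.vertices`, which allows a new result type. The names `JoinFollowersSketch`, `sampleGraph`, and `personWithFollowers` are hypothetical and chosen only for illustration; `Person` and `EdgeImpl` are copied from the question.

```scala
import org.apache.spark.graphx._
import org.apache.spark.{SparkConf, SparkContext}

case class Person(age: Int)
case class EdgeImpl()

// Hypothetical object name, used only for this illustration.
object JoinFollowersSketch {

  // Rebuilds the small graph from the question: vertices 2, 3 and 4 all point at vertex 1.
  def sampleGraph(sc: SparkContext): Graph[Person, EdgeImpl] = {
    val vertices = sc.parallelize(Seq(
      (1L, Person(10)), (2L, Person(15)), (3L, Person(20)), (4L, Person(30))))
    val edges = sc.parallelize(Seq(
      Edge(2L, 1L, EdgeImpl()), Edge(3L, 1L, EdgeImpl()), Edge(4L, 1L, EdgeImpl())))
    Graph(vertices, edges)
  }

  // Pairs each vertex's own attribute with the followers collected for it.
  def personWithFollowers(
      graph: Graph[Person, EdgeImpl]): VertexRDD[(Person, Array[Person])] = {
    val followers: VertexRDD[Array[Person]] =
      graph.aggregateMessages[Array[Person]](
        ctx => ctx.sendToDst(Array(ctx.srcAttr)),
        (a, b) => a ++ b
      )
    // innerJoin keeps only the vertices that received at least one message.
    graph.vertices.innerJoin(followers) { (id, person, fs) => (person, fs) }
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("JoinFollowersSketch").setMaster("local[1]")
    val sc = new SparkContext(conf)
    try {
      personWithFollowers(sampleGraph(sc)).collect.foreach {
        case (id, (person, fs)) =>
          println(s"$id $person followers=${fs.mkString(", ")}")
      }
    } finally {
      sc.stop()
    }
  }
}
```

If the vertices without any followers should be kept as well, `graph.outerJoinVertices(followers) { (id, person, fsOpt) => (person, fsOpt.getOrElse(Array.empty[Person])) }` is an alternative: it returns a new graph whose vertex attribute is the (person, followers) pair, with an empty array for vertices that received no messages.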