I have a list in Scala that looks like this:
val log = List(
List("a","b","c"),
List("a","c","b","h","c"),
List("a","d","e"),
List("a","d","e","f","d","e")
)
I want to build a graph like this from it:
val vertexName: RDD[(VertexId, (String))] =
sc.parallelize(Array((1L, ("a")), (2L, ("b")),
(3L, ("c")), (4L, ("d")),
(5L, ("e")), (6L, ("f")),
(7L, ("h"))))
val edgeName: RDD[Edge[String]] =
sc.parallelize(Array(Edge(1L, 2L, "1"), Edge(2L, 3L, "1"),
Edge(1L, 3L, "1"), Edge(3L, 2L, "1"),
Edge(2L, 7L, "1"), Edge(7L, 3L, "1"),
Edge(1L, 4L, "1"), Edge(4L, 5L, "1"),
Edge(5L, 6L, "1"), Edge(6L, 4L, "1")))
val graph = Graph(vertexName, edgeName)
Is this possible? Is there a way to do it?
Answer 0 (score: 1)
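I'm assuming that each of your lists of vertices is a path that should be found in the graph.
I would start by building a mapping between the vertex names and their VertexId: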
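import org.apache.spark.graphx.{Edge, Graph}

// Distinct vertex names, each mapped to a Long id (ids start at 0)
val vertices = log.flatten.distinct
val vertexMap = vertices.zipWithIndex
  .map { case (name, i) => name -> i.toLong }
  .toMap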
Then I would use the vertex map to generate the edges, collected into a set to avoid duplicates:
val edgeSet = log
  .filter(_.size > 1) // a single vertex is not a path
  .flatMap(list => list.zip(list.tail)) // consecutive (source, destination) pairs
  .map { case (src, dst) => Edge(vertexMap(src), vertexMap(dst), "1") }
  .toSet // identical edges collapse into one
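As a quick sanity check of the pairing step, here is a minimal Spark-free sketch that extracts the consecutive pairs by name only. The names `paths` and `namedEdges` are made up for this illustration; the data is the sample log from the question. Under that assumption it yields 10 distinct pairs, the same number of edges as in the question's expected `edgeName`.

```scala
// Sample log from the question; each inner list is treated as a path.
val paths: List[List[String]] = List(
  List("a", "b", "c"),
  List("a", "c", "b", "h", "c"),
  List("a", "d", "e"),
  List("a", "d", "e", "f", "d", "e")
)

// Pair every element with its successor. zip stops at the shorter list,
// so empty and one-element paths contribute no pairs.
val namedEdges: Set[(String, String)] =
  paths.flatMap(path => path.zip(path.drop(1))).toSet
```

Using `drop(1)` instead of `tail` keeps the expression safe on an empty path, so the explicit size filter is not needed in this variant.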
And create the graph:
val edges = sc.parallelize(edgeSet.toSeq)
val vertexNames = sc.parallelize(vertexMap.toSeq.map(_.swap))
val graph = Graph(vertexNames, edges)
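Note that the ids produced above start at 0, while the question's example numbers the vertices from 1 in alphabetical order. If that exact numbering matters, one possible sketch (an assumption about what is wanted, not part of the original answer) is to sort the distinct names and offset the index by one. The names `logSample` and `orderedVertexMap` are hypothetical.

```scala
// Sample log from the question.
val logSample: List[List[String]] = List(
  List("a", "b", "c"),
  List("a", "c", "b", "h", "c"),
  List("a", "d", "e"),
  List("a", "d", "e", "f", "d", "e")
)

// Sort the distinct names alphabetically and assign 1-based Long ids,
// so that a -> 1, b -> 2, ..., f -> 6, h -> 7 as in the question.
val orderedVertexMap: Map[String, Long] =
  logSample.flatten.distinct.sorted.zipWithIndex
    .map { case (name, i) => name -> (i + 1L) }
    .toMap
```

Using this map in place of `vertexMap` should leave the rest of the code unchanged, since GraphX's `VertexId` is an alias for `Long`.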