Hello, I have built several multigraphs (11 in total), for example:
Graph 1 - SongArtist - SongVertex(Id, SongName), ArtistVertex(Id, ArtistName, NetWorth), Edge(Song, Artist, "Sung")
Graph 2 - SongWriter - SongVertex(Id, SongName), WriterVertex(Id, ArtistName), Edge(Song, Writer, "WrittenBy")
Graph 3 - ArtistWriter - ArtistVertex(Id, ArtistName, NetWorth), WriterVertex(Id, ArtistName), Edge(Artist, Writer, "Collaborated") ...
I want to be able to merge all of these into a single graph. Graph 1 and Graph 2 can be merged on Song, Graph 2 and Graph 3 can be merged on Writer, and Graph 1 and Graph 3 can be merged on Artist.
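Some of the graphs have edge and vertex properties defined by case classes. The following shows how Graph 3 was built; the others follow more or less the same structure: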
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// EdgeProperty and VertexProperty are the base types, defined elsewhere.
case class ArtistWriterProperties(weight: String, edgeType: String) extends EdgeProperty
case class ArtistProperty(vertexType: String, artistName: String, netWorth: String) extends VertexProperty
case class WriterProperty(vertexType: String, writerName: String) extends VertexProperty

// Each vertex line is parsed as: id, vertexType, name and, for artists, netWorth.
val ArtistWriter: RDD[(VertexId, VertexProperty)] = sc.textFile(vertexArtistWriter).map { line =>
  val row = line.split(",")
  val id = row(0).toLong
  val vertexType = row(1)
  val prop = vertexType match {
    case "Artist" => ArtistProperty(vertexType, row(2), row(3))
    case "Writer" => WriterProperty(vertexType, row(2))
    // Fail with a readable message instead of a bare MatchError.
    case other => throw new IllegalArgumentException(s"Unknown vertex type: $other")
  }
  (id, prop)
}

// Each edge line is parsed as: srcId, dstId, weight, edgeType.
val edgesArtistWriterCollaborated: RDD[Edge[EdgeProperty]] = sc.textFile(edgeWeightedArtistWriterCollaborated).map { line =>
  val row = line.split(",")
  Edge(row(0).toLong, row(1).toLong, ArtistWriterProperties(row(2), row(3)))
}

val graph3 = Graph(ArtistWriter, edgesArtistWriterCollaborated)
This is what I am trying:
val graph2And3 = Graph(
  graph2.vertices.union(graph3.vertices),
  graph2.edges.union(graph3.edges)
).partitionBy(RandomVertexCut)
  .groupEdges((attr1, attr2) => attr1 + attr2)
But I get a "type mismatch" error.
Answer 0 (score: 1)
So basically you need to perform a join for the vertices and a union for the edges.
For each graph, you can get an RDD of its vertices and an RDD of its edges.
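As a small illustration of that point, here is a minimal sketch that pulls both RDDs out of a toy graph. The object name InspectGraphRdds, the helper toyGraph, and the data are made up for this example, and it assumes Spark with GraphX is on the classpath and runs in local mode.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

object InspectGraphRdds {
  // Hypothetical toy graph: one song vertex, one artist vertex, one "Sung" edge.
  def toyGraph(sc: SparkContext): Graph[String, String] = {
    val vertices: RDD[(VertexId, String)] =
      sc.parallelize(Seq((1L, "song-a"), (2L, "artist-x")))
    val edges: RDD[Edge[String]] =
      sc.parallelize(Seq(Edge(1L, 2L, "Sung")))
    Graph(vertices, edges)
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[*]").setAppName("inspect-graph"))
    val graph = toyGraph(sc)

    // VertexRDD[VD] is an RDD[(VertexId, VD)], so pair-RDD joins apply to it.
    val vertices: RDD[(VertexId, String)] = graph.vertices
    // EdgeRDD[ED] is an RDD[Edge[ED]], so a plain union applies to it.
    val edges: RDD[Edge[String]] = graph.edges

    vertices.collect().foreach { case (id, attr) => println(s"vertex $id: $attr") }
    edges.collect().foreach(e => println(s"edge ${e.srcId} -> ${e.dstId}: ${e.attr}"))
    sc.stop()
  }
}
```

Because the vertex RDD is keyed by VertexId, a join between two graphs' vertices matches on that id by default.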
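1) Full outer join the vertex RDDs, in order, on the required key, and create new IDs for the final vertices. For example, in pseudocode (GraphX exposes the vertex RDD as vertices, and the real fullOuterJoin joins on the pair key rather than taking a key-name argument): graph1.vertices.fullOuterJoin(graph2.vertices, "SongArtist").fullOuterJoin...
2) Union all the edge RDDs. You can then create the graph from the new vertex RDD and the combined edge RDD.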
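The two steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a definitive implementation: it assumes the same entity carries the same vertex id in every graph (so no new ids are generated), that all graphs share the type Graph[VertexProperty, EdgeProperty], and that Spark with GraphX is on the classpath. The trait and case class definitions, the names MergeGraphs, mergeVertices, merge, and collapseParallelEdges, and the toy data are all hypothetical stand-ins. The last helper also reflects a guess about the question's error: groupEdges needs a function of type (ED, ED) => ED, and EdgeProperty defines no +, which may be what triggers the type mismatch.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, PartitionStrategy, VertexId}
import org.apache.spark.rdd.RDD

// Hypothetical stand-ins for the property hierarchy in the question.
sealed trait VertexProperty
case class SongProperty(vertexType: String, songName: String) extends VertexProperty
case class WriterProperty(vertexType: String, writerName: String) extends VertexProperty
case class ArtistProperty(vertexType: String, artistName: String, netWorth: String) extends VertexProperty

sealed trait EdgeProperty
case class RelationProperty(weight: String, edgeType: String) extends EdgeProperty

object MergeGraphs {
  type PropertyGraph = Graph[VertexProperty, EdgeProperty]

  // Step 1: full outer join on the vertex id. After a full outer join at
  // least one side is defined; the left side wins when both are present.
  def mergeVertices(
      left: RDD[(VertexId, VertexProperty)],
      right: RDD[(VertexId, VertexProperty)]): RDD[(VertexId, VertexProperty)] =
    left.fullOuterJoin(right).mapValues { case (l, r) => l.orElse(r).get }

  // Step 2: union the edges, then build the graph from both RDDs.
  def merge(a: PropertyGraph, b: PropertyGraph): PropertyGraph =
    Graph(mergeVertices(a.vertices, b.vertices), a.edges.union(b.edges))

  // Optional: collapse parallel edges. Keeping the first edge is an
  // arbitrary choice made for this sketch.
  def collapseParallelEdges(g: PropertyGraph): PropertyGraph =
    g.partitionBy(PartitionStrategy.RandomVertexCut)
      .groupEdges((first, _) => first)

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[*]").setAppName("merge-graphs"))

    // Toy data: vertex 2 (a writer) appears in both graphs under the same id.
    val graph2: PropertyGraph = Graph(
      sc.parallelize(Seq[(VertexId, VertexProperty)](
        (1L, SongProperty("Song", "song-a")),
        (2L, WriterProperty("Writer", "writer-x")))),
      sc.parallelize(Seq(
        Edge[EdgeProperty](1L, 2L, RelationProperty("1", "WrittenBy")))))
    val graph3: PropertyGraph = Graph(
      sc.parallelize(Seq[(VertexId, VertexProperty)](
        (3L, ArtistProperty("Artist", "artist-y", "0")),
        (2L, WriterProperty("Writer", "writer-x")))),
      sc.parallelize(Seq(
        Edge[EdgeProperty](3L, 2L, RelationProperty("1", "Collaborated")))))

    val merged = collapseParallelEdges(merge(graph2, graph3))
    println(s"vertices=${merged.vertices.count()} edges=${merged.edges.count()}")
    sc.stop()
  }
}
```

To fold in all 11 graphs, the same merge function can be applied pairwise, for example with graphs.reduce(MergeGraphs.merge). If ids are not shared across graphs, the join key would have to be something like the entity name, and the edges would need remapping to the new ids, which this sketch does not cover.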