GraphX edges with multiple attributes

Asked: 2018-06-06 22:24:36

Tags: apache-spark spark-graphx

I am using GraphX and trying to attach attributes to edges. I have a CSV file with the columns Id1, Id2, Weight, Type.

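I am able to load the IDs and one attribute, either the weight or the type. Is there a way to store multiple attributes on an edge? Here is my code snippet: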

val edgesWriterWriterCollaborated: RDD[Edge[String]] = sc.textFile(edgeWeightedWriterWriterCollaborated).map {
  line =>
    val row = line.split(",")
    Edge(row(0).toLong, row(1).toLong, row(2))
}

This gives me an error:

val edgesWriterWriterCollaborated: RDD[Edge[Tuple2]] = sc.textFile(edgeWeightedWriterWriterCollaborated).map {
  line =>
    val row = line.split(",")
    Edge(row(0).toLong, row(1).toLong, (row(2), row(3)))
}

Update

I fixed my code:

case class WriterWriterProperties(weight: String, edgeType: String)

val edgesWriterWriterCollaborated: RDD[Edge[WriterWriterProperties]] = sc.textFile(edgeWeightedWriterWriterCollaborated).map {
  line =>
    val row = line.split(",")
    Edge(row(0).toLong, row(1).toLong, WriterWriterProperties(row(2), row(3)))
}

However, when I try to print:

   graph4.triplets.foreach(println)

I get the error: Caused by: java.io.NotSerializableException

1 answer:

Answer 0: (score: 0)

Sure. Use a Tuple2:

Edge(row(0).toLong, row(1).toLong, (row(2), row(3)))

or any domain-specific object that makes sense in your case:

case class FooBar(foo: String, bar: String)

Edge(row(0).toLong, row(1).toLong, FooBar(row(2), row(3)))
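To put the two options above in context, here is a minimal sketch of both variants from parsing through to printing. It rests on several assumptions that are not in the original post: the CSV columns are Id1,Id2,Weight,Type, the weight parses as a Double, and Spark runs in local mode. The names EdgeProps, MultiAttributeEdges and parseEdge are hypothetical, as are the sample rows. Note that the tuple variant spells out its type parameters as Edge[(String, String)]; a bare Tuple2 in the type annotation does not compile because Tuple2 takes type parameters.

```scala
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// Hypothetical edge attribute type. It is declared at the top level so that
// it carries no hidden reference to an enclosing (possibly non-serializable) object.
case class EdgeProps(weight: Double, edgeType: String)

object MultiAttributeEdges {

  // Parse one CSV line "Id1,Id2,Weight,Type" (assumed layout) into an edge
  // whose attribute is a case class holding both values.
  def parseEdge(line: String): Edge[EdgeProps] = {
    val row = line.split(",").map(_.trim)
    Edge(row(0).toLong, row(1).toLong, EdgeProps(row(2).toDouble, row(3)))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("multi-attribute-edges")
      .getOrCreate()
    val sc = spark.sparkContext

    // In-memory stand-in for sc.textFile(...); the rows are made up.
    val lines: RDD[String] =
      sc.parallelize(Seq("1,2,0.5,coauthor", "2,3,1.5,editor"))

    // Variant 1: tuple attribute, with explicit type parameters.
    val tupleEdges: RDD[Edge[(String, String)]] = lines.map { line =>
      val row = line.split(",")
      Edge(row(0).toLong, row(1).toLong, (row(2), row(3)))
    }
    println(s"tuple edges: ${tupleEdges.count()}")

    // Variant 2: case-class attribute.
    val typedEdges: RDD[Edge[EdgeProps]] = lines.map(parseEdge)

    // Vertices are created implicitly with a default attribute.
    val graph = Graph.fromEdges(typedEdges, defaultValue = "unknown")

    // Extract plain values on the executors, then print on the driver.
    graph.triplets
      .map(t => (t.srcId, t.dstId, t.attr))
      .collect()
      .foreach { case (src, dst, attr) => println(s"$src -> $dst: $attr") }

    spark.stop()
  }
}
```

If a NotSerializableException shows up with a case-class attribute, one common cause is that the case class is nested inside a class that is not serializable, or that the closure captures such an object. Whether that applies to the code in the question cannot be determined from the snippet alone; declaring the case class at the top level, as in the sketch, is a cheap thing to try first.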