When I launch spark-submit I get the following error:
Exception in thread "main" java.lang.UnsupportedOperationException: empty collection
Spark Code Class 1:
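// Collect the distinct vertex ids from both columns of the input file.
val vertices = sc.textFile("hdfs:///user/cloudera/dstlf.txt")
  .flatMap { line => line.split("\\s+") }
  .distinct()
// Write one "<id without dashes>\t<original id>" line per vertex.
vertices.map { vertex =>
  vertex.replace("-", "") + "\t" + vertex
}.saveAsTextFile("hdfs:///user/cloudera/metadata-lookup-tlf")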
Spark Code Class 2:
val sc = new SparkContext(conf)
val graph = GraphLoader.edgeListFile(sc,"hdfs:///user/cloudera/metadata-processed-tlf")
// Run PageRank
val ranks = graph.pageRank(0.0001).vertices
For some reason it does not seem to be able to run on the RDD.
The input file dstlf contains two columns of numbers:
94-92250 94-92174
94-92889 94-91869
94-94172 94-91682
...
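
For reference, GraphLoader.edgeListFile reads each line as two whitespace-separated integer vertex ids, so ids such as 94-92250 would need their dashes removed before the file can be loaded. Below is a minimal sketch of that conversion in plain Scala, without Spark, so it can be tried locally. The object EdgeListSketch and the helper toEdgeLine are hypothetical names, and it is only an assumption that metadata-processed-tlf is meant to hold lines in this form.

```scala
object EdgeListSketch {
  // Hypothetical helper: turn one input line such as "94-92250 94-92174"
  // into an edge-list line "9492250 9492174".
  // Returns None for lines that do not hold exactly two numeric ids.
  def toEdgeLine(line: String): Option[String] = {
    val ids = line.trim.split("\\s+").map(_.replace("-", ""))
    val allNumeric = ids.forall(id => id.nonEmpty && id.forall(_.isDigit))
    if (ids.length == 2 && allNumeric) Some(ids.mkString(" ")) else None
  }

  def main(args: Array[String]): Unit = {
    // The three sample rows shown in the question.
    val sample = Seq("94-92250 94-92174", "94-92889 94-91869", "94-94172 94-91682")
    // Drop malformed lines and keep the converted ones.
    val edges = sample.flatMap(line => toEdgeLine(line))
    edges.foreach(println)
  }
}
```

In a Spark job the same helper could presumably be applied with something like sc.textFile(...).flatMap(line => toEdgeLine(line)) before saving the result. It may also be worth checking that the resulting RDD is not empty before calling pageRank.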