Java example for GraphX

Date: 2014-03-21 13:50:31

Tags: apache-spark

Where can I find a Java equivalent of the GraphX example? For instance, how would the following translate:

val users: RDD[(VertexId, (String, String))] =
  sc.parallelize(Array(
    (3L, ("rxin", "student")), (7L, ("jgonzal", "postdoc")),
    (5L, ("franklin", "prof")), (2L, ("istoica", "prof"))))

// Create an RDD for edges
val relationships: RDD[Edge[String]] =
  sc.parallelize(Array(
    Edge(3L, 7L, "collab"), Edge(5L, 3L, "advisor"),
    Edge(2L, 5L, "colleague"), Edge(5L, 7L, "pi")))

// Define a default user in case there are relationships with missing users
val defaultUser = ("John Doe", "Missing")

// Build the initial Graph
val graph = Graph(users, relationships, defaultUser)

6 Answers:

Answer 0 (score: 1)

Got a reply on dev@spark.apache.org: "There is no Java API yet."

See the users mailing list.

Answer 1 (score: 1)

There is still no Java API: "GraphX is only available via the Scala API."

Source: https://forums.databricks.com/questions/3185/i-am-looking-for-some-tutorial-on-graphx-using-jav.html

Answer 2 (score: 1)

GraphX is only available in Scala. You can look at GraphFrames instead, which works on DataFrames (Java, Python, Scala) rather than on low-level RDDs. It has additional advantages because it can leverage the Catalyst query optimizer and Project Tungsten optimizations.

You can convert a GraphX graph to a GraphFrame and vice versa. Vertex IDs can be of any type, unlike GraphX where they must be Long. Return types are DataFrames or GraphFrames, whereas in GraphX they are Graph[VD, ED] and RDDs.
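
For illustration, here is a minimal Java sketch of the question's graph built with GraphFrames. This is a sketch under assumptions, not GraphFrames documentation: it assumes the org.graphframes:graphframes package is on the classpath and that the GraphFrame.apply factory is reachable from Java; the "id", "src", and "dst" column names follow the GraphFrames convention.

    import java.util.Arrays;
    import java.util.List;

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.RowFactory;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.sql.types.DataTypes;
    import org.apache.spark.sql.types.StructField;
    import org.apache.spark.sql.types.StructType;
    import org.graphframes.GraphFrame;

    public class GraphFramesJavaExample {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder().appName("graphframes-java").getOrCreate();

            // Vertices: GraphFrames expects an "id" column; the other columns are free-form attributes.
            StructType vertexSchema = new StructType(new StructField[]{
                    DataTypes.createStructField("id", DataTypes.LongType, false),
                    DataTypes.createStructField("name", DataTypes.StringType, false),
                    DataTypes.createStructField("role", DataTypes.StringType, false)});
            List<Row> vertexRows = Arrays.asList(
                    RowFactory.create(3L, "rxin", "student"),
                    RowFactory.create(7L, "jgonzal", "postdoc"),
                    RowFactory.create(5L, "franklin", "prof"),
                    RowFactory.create(2L, "istoica", "prof"));
            Dataset<Row> vertices = spark.createDataFrame(vertexRows, vertexSchema);

            // Edges: GraphFrames expects "src" and "dst" columns.
            StructType edgeSchema = new StructType(new StructField[]{
                    DataTypes.createStructField("src", DataTypes.LongType, false),
                    DataTypes.createStructField("dst", DataTypes.LongType, false),
                    DataTypes.createStructField("relationship", DataTypes.StringType, false)});
            List<Row> edgeRows = Arrays.asList(
                    RowFactory.create(3L, 7L, "collab"),
                    RowFactory.create(5L, 3L, "advisor"),
                    RowFactory.create(2L, 5L, "colleague"),
                    RowFactory.create(5L, 7L, "pi"));
            Dataset<Row> edges = spark.createDataFrame(edgeRows, edgeSchema);

            // Assumption: GraphFrame.apply is the Scala companion factory, callable from Java.
            GraphFrame graph = GraphFrame.apply(vertices, edges);
            graph.inDegrees().show();

            spark.stop();
        }
    }

Because the vertex IDs are ordinary DataFrame columns here, they could just as well be strings, which is the flexibility mentioned above.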

Answer 3 (score: 0)

According to [SPARK-3665] Java API for GraphX - ASF JIRA: "Target Version/s: 1.3.0"

Answer 4 (score: 0)

Although there is no official or unofficial documentation, there is a graphx library for Java. See this and this.

Answer 5 (score: 0)

You can use it like this:

    // Imports required by this snippet; the enclosing class declaration is omitted.
    import java.io.Serializable;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.graphx.Edge;
    import org.apache.spark.graphx.EdgeDirection;
    import org.apache.spark.graphx.EdgeTriplet;
    import org.apache.spark.graphx.Graph;
    import org.apache.spark.graphx.Pregel;
    import org.apache.spark.graphx.impl.GraphImpl;
    import org.apache.spark.rdd.RDD;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.storage.StorageLevel;
    import scala.Tuple2;
    import scala.collection.JavaConverters;
    import scala.runtime.AbstractFunction1;
    import scala.runtime.AbstractFunction2;
    import scala.runtime.AbstractFunction3;

    public static void main(String[] args) {

        SparkSession spark = SparkSession
                .builder()
                .appName("javaGraphx")
                .getOrCreate();

        // Vertex list: GraphX vertex IDs are Longs, boxed as Object on the Java side.
        List<Tuple2<Object, Data>> vectorList = Arrays.asList(
                new Tuple2[]{
                        new Tuple2(1L, new Data("A", 1)),
                        new Tuple2(2L, new Data("B", 1)),
                        new Tuple2(3L, new Data("C", 1))}
        );

        // Parallelize via the Scala SparkContext: convert to a Scala Seq and pass an explicit ClassTag.
        JavaRDD<Tuple2<Object, Data>> users = spark.sparkContext().parallelize(
                JavaConverters.asScalaIteratorConverter(vectorList.iterator()).asScala()
                        .toSeq(), 1, scala.reflect.ClassTag$.MODULE$.apply(Tuple2.class)
        ).toJavaRDD();

        // Edge list with Integer edge attributes.
        List<Edge<Integer>> edgeList = Arrays.asList(
                new Edge[]{
                        new Edge(1L, 2L, 1),
                        new Edge(2L, 3L, 1)});

        JavaRDD<Edge<Integer>> followers = spark.sparkContext().parallelize(
                JavaConverters.asScalaIteratorConverter(edgeList.iterator()).asScala()
                        .toSeq(), 1, scala.reflect.ClassTag$.MODULE$.apply(Edge.class)
        ).toJavaRDD();

        // Construct the graph. The ClassTags for the vertex and edge types must be
        // passed explicitly because Java cannot supply Scala's implicit parameters.
        Graph<Data, Integer> followerGraph =
                GraphImpl
                        .apply(
                                users.rdd(),
                                followers.rdd(),
                                new Data("OVER", 0),
                                StorageLevel.MEMORY_AND_DISK(),
                                StorageLevel.MEMORY_AND_DISK(),
                                scala.reflect.ClassTag$.MODULE$.apply(Data.class),
                                scala.reflect.ClassTag$.MODULE$.apply(Integer.class)
                            );
        // If you want to use Pregel, you can call it like this: initial message,
        // max iterations, active edge direction, then the vertex program, send-message,
        // and merge-message functions, plus ClassTags for the vertex, edge, and message types.
        Graph<Data, Integer> dd = Pregel.apply(followerGraph,
                new Data("", 0),
                5,
                EdgeDirection.Out(),
                new Vprog(),
                new SendMsg(),
                new MergemMsg(),
                scala.reflect.ClassTag$.MODULE$.apply(Data.class),
                scala.reflect.ClassTag$.MODULE$.apply(Integer.class),
                scala.reflect.ClassTag$.MODULE$.apply(Data.class)
        );

        RDD<Data> r = dd
                .vertices()
                .toJavaRDD()
                .map(t->t._2)
                .rdd();

        r.toJavaRDD().foreach(
                o->{
                    System.out.println(o.getS());
                }
                );
    }

    // The three functions used by Pregel above: vertex program, send-message, merge-message.
    static class Vprog extends AbstractFunction3< Object, Data, Data, Data> implements Serializable {
        @Override
        public Data apply(Object l, Data self, Data sumOdd) {
            System.out.println(l + "---" + self.getS()+self.getI()+" ---> "+sumOdd.getS()+sumOdd.getI());
                self.setS(sumOdd.getS() + self.getS());
                self.setI(self.getI() + sumOdd.getI());
            System.out.println(l + "---" + self.getS()+self.getI()+" ---> "+sumOdd.getS()+sumOdd.getI());
                //Don't just return self here, return a new one;
                return new Data(self.getS(), self.getI());
        }
    }

    static class SendMsg extends AbstractFunction1<EdgeTriplet<Data,Integer>, scala.collection.Iterator<Tuple2<Object, Data>>> implements Serializable {
        @Override
        public scala.collection.Iterator<Tuple2<Object, Data>> apply(EdgeTriplet<Data,Integer> t) {
            System.out.println(t.srcId()+" ---> "+t.dstId()+" with: "+t.srcAttr().getS()+t.srcAttr().getI()+" ---> "+t.dstAttr().getS()+t.dstAttr().getI());

            if(t.srcAttr().getI() <= 8){
                List<Tuple2<Object, Data>> data = new ArrayList();
                data.add(new Tuple2<>( t.dstId(), new Data(t.srcAttr().getS(), t.srcAttr().getI())));
                return JavaConverters.asScalaIteratorConverter(data.iterator()).asScala();
            }else{
                return JavaConverters.asScalaIteratorConverter(new ArrayList<Tuple2<Object, Data>>().iterator()).asScala();
            }
        }
    }

    static class MergemMsg extends AbstractFunction2< Data, Data, Data> implements Serializable {
        @Override
        public Data apply(Data a, Data b) {
            return new Data( "" + a.getS() + b.getS(), a.getI() + b.getI());
        }
    }
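
The Data vertex-attribute class is not shown in the answer. A minimal sketch that matches the way it is used above (a serializable, mutable String/int pair) would be:

    // Hypothetical vertex-attribute class assumed by the snippet above.
    static class Data implements Serializable {
        private String s;
        private int i;

        Data(String s, int i) {
            this.s = s;
            this.i = i;
        }

        public String getS() { return s; }
        public int getI() { return i; }
        public void setS(String s) { this.s = s; }
        public void setI(int i) { this.i = i; }
    }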