Giraph: Using Text as VertexId

Time: 2014-10-15 11:18:51

Tags: giraph

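I am trying to test Giraph.

VertexId type: Text

Input: edge-based

If I use Text as the VertexId, I get an error. With LongWritable, everything works.

Question: 1. Is it possible to use Text as the VertexId? If so, what am I doing wrong?

Error: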

14/10/15 14:59:28 INFO worker.InputSplitsCallable: call: Loaded 1 input splits in 0.08243016 secs, (v=0, e=12) 0.0 vertices/sec, 145.57777 edges/sec
14/10/15 14:59:28 ERROR utils.LogStacktraceCallable: Execution of callable failed
java.lang.ArrayIndexOutOfBoundsException
    at java.lang.System.arraycopy(Native Method)
    at org.apache.giraph.utils.UnsafeArrayReads.readFully(UnsafeArrayReads.java:103)
    at org.apache.hadoop.io.Text.readFields(Text.java:265)
    at org.apache.giraph.utils.ByteStructVertexIdDataIterator.next(ByteStructVertexIdDataIterator.java:65)
    at org.apache.giraph.edge.AbstractEdgeStore.addPartitionEdges(AbstractEdgeStore.java:161)

Custom format:

public class TextDoubleTextEdgeInputFormat extends
    TextEdgeInputFormat<Text, DoubleWritable> {
  /** Splitter for endpoints */
  private static final Pattern SEPARATOR = Pattern.compile("[\t ]");

...
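The body of the custom format is elided above, so the following is only a minimal sketch of how a line such as "2 3 2.0" from the test graph could be split with the same SEPARATOR pattern. The class name EdgeLineSplitDemo and the method splitEdgeLine are hypothetical and are not part of Giraph or of the original code; the sketch uses only the JDK.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Giraph: shows how the SEPARATOR
// pattern from the question tokenizes one "source target value" line.
public class EdgeLineSplitDemo {
    // Same pattern as in the question: a single tab or a single space.
    private static final Pattern SEPARATOR = Pattern.compile("[\t ]");

    /** Splits one edge line into exactly three tokens: source, target, value. */
    static String[] splitEdgeLine(String line) {
        String[] tokens = SEPARATOR.split(line);
        if (tokens.length != 3) {
            throw new IllegalArgumentException(
                    "Expected 'source target value', got: " + line);
        }
        return tokens;
    }

    public static void main(String[] args) {
        String[] tokens = splitEdgeLine("2 3 2.0");
        String source = tokens[0];                    // kept as text, like a Text vertex id
        String target = tokens[1];
        double value = Double.parseDouble(tokens[2]); // edge value, like a DoubleWritable
        System.out.println(source + " -> " + target + " (" + value + ")");
    }
}
```

Because the pattern matches exactly one separator character, two consecutive spaces would produce an empty token, which this sketch rejects instead of silently accepting.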

Main class:

public class HelloWorld extends
        BasicComputation<Text, Text, NullWritable, NullWritable> {

    @Override
    public void compute(Vertex<Text, Text, NullWritable> vertex,
            Iterable<NullWritable> messages) {
        System.out.print("Hello world from the: " + vertex.getId().toString()
                + " who is following:");
        for (Edge<Text, NullWritable> e : vertex.getEdges()) {
            System.out.print(" " + e.getTargetVertexId());
        }
        System.out.println("");
        vertex.voteToHalt();

    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new GiraphRunner(), args));
    }
}

Test runner:

@Test
public void textDoubleTextEdgeInputFormatTest() throws Exception {

    String[] graph = { "1 2 1.0", "2 1 1.0", "1 3 1.0", "3 1 1.0",
            "2 3 2.0", "3 2 2.0", "3 4 2.0", "4 3 2.0", "3 5 1.0",
            "5 3 1.0", "4 5 1.0", "5 4 1.0" };

    GiraphConfiguration conf = new GiraphConfiguration();
    conf.setComputationClass(HelloWorld.class);
    conf.setEdgeInputFormatClass(TextDoubleTextEdgeInputFormat.class);
    // conf.setEdgeInputFormatClass(TextLongL6TextEdgeInputFormat.class);

    // conf.setVertexOutputFormatClass(IdWithValueTextOutputFormat.class);

    InternalVertexRunner.run(conf, null, graph);
}

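1 answer:

Answer 0 (score: 0)

I have no experience with Giraph, but in Apache Spark GraphX the VertexId type is Long.

I would not be surprised if Apache Spark GraphX reused design patterns from Apache Giraph. So I think the Long type is the way to go, given that your LongWritable implementation works.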
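If falling back to long ids is acceptable, one possible workaround is to translate text ids into long ids in a preprocessing step before the graph is handed to Giraph. The sketch below is an assumption about how that could look, not something taken from Giraph: the class VertexIdMapper and its methods are hypothetical, it runs in a single process, and it assumes the same "source target value" line layout as the test graph.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical preprocessing helper, not part of Giraph: assigns a stable
// long id to every distinct text id, in order of first appearance.
public class VertexIdMapper {
    private final Map<String, Long> ids = new LinkedHashMap<>();

    /** Returns the long id for a name, allocating the next free id if new. */
    long idFor(String name) {
        Long id = ids.get(name);
        if (id == null) {
            id = (long) ids.size();
            ids.put(name, id);
        }
        return id;
    }

    /** Rewrites "source target value" so that both endpoints are long ids. */
    String rewriteEdgeLine(String line) {
        String[] t = line.split("[\t ]");
        return idFor(t[0]) + " " + idFor(t[1]) + " " + t[2];
    }

    public static void main(String[] args) {
        VertexIdMapper mapper = new VertexIdMapper();
        String[] graph = { "alice bob 1.0", "bob alice 1.0", "alice carol 2.0" };
        for (String line : graph) {
            System.out.println(mapper.rewriteEdgeLine(line));
        }
    }
}
```

The rewritten lines can then be read by an edge input format that uses LongWritable ids, which the question reports as working. The mapping has to be kept so that results can be translated back to the original names.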