groupCount using a lambda step fails in OLAP with SparkGraphComputer

Posted: 2017-03-06 18:04:57

Tags: apache-spark titan gremlin

I want to compute a groupCount over a transformation of a particular property value available on a certain class of vertices in my graph. This seems to work fine with graph.traversal(), but not with SparkGraphComputer.

Excerpt of the standard traversal:

gremlin> g = graph.traversal()
gremlin> g.V().hasLabel('webSession').values('channelSessionId')
            .map { it.toString().substring(0, 5)}
            .groupCount().order(local).by(values, decr)
==>[013b7:15,03190:11,04132:10,08c6a:10,028aa:9,005fe:9,09217:9,
    03ee8:9,0618d:9,0a0d3:9,079d5:9,055c5:9,05b15:9,
    0068b:8,005c0:8,03d30:8,009f8:8,07561:8,00d90:8,
    07794:8,066b9:8,09e13:8,09e9c:8,057b7:8,04781:8,...]

Excerpt using SparkGraphComputer:

gremlin> g = graph.traversal().withComputer(SparkGraphComputer)
gremlin> g.V().hasLabel('webSession').values('channelSessionId')
            .map { it.toString().substring(0, 5)}
            .groupCount().order(local).by(values, decr)
org.apache.spark.SparkException: Task not serializable

How can I convert this query into an OLAP query?
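
One possible workaround, offered only as a sketch (not verified against Titan): the "Task not serializable" error comes from the inline Groovy closure, so replacing it with TinkerPop's string-based lambda helper (org.apache.tinkerpop.gremlin.util.function.Lambda), which is serializable and compiled on the workers, might allow the same traversal to run under SparkGraphComputer:

gremlin> import org.apache.tinkerpop.gremlin.util.function.Lambda
gremlin> g = graph.traversal().withComputer(SparkGraphComputer)
gremlin> g.V().hasLabel('webSession').values('channelSessionId')
            .map(Lambda.function("it.get().toString().substring(0, 5)"))
            .groupCount().order(local).by(values, decr)

Note that a string-based lambda only addresses the serialization problem; the closure body is still evaluated on the Spark workers, and whether lambdas are permitted at all may depend on the traversal strategies configured for the OLAP graph.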

Another related question: if I am only interested in the top 5 elements of the resulting groupCount Map&lt;&gt;, is there a way to do that?
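
For the top-5 part, a hedged sketch: groupCount() emits a single Map, so after the local ordering the map could be trimmed with a local-scoped limit, e.g. limit(local, 5), which keeps only the first five entries:

gremlin> g.V().hasLabel('webSession').values('channelSessionId')
            .map { it.toString().substring(0, 5)}
            .groupCount().order(local).by(values, decr)
            .limit(local, 5)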

0 Answers:

There are no answers yet.