How to output similar users together with the reasons for the similarity in Gremlin

Asked: 2015-03-05 13:57:32

Tags: graph gremlin tinkerpop

I have the following simple graph:

User -Likes-> Item

I use the following Gremlin code to find the top 10 users most similar to a given user:

u.out('Likes').in('Likes').filter([u]).groupCount.cap.orderMap(T.decr)[0..10].map()

This outputs something like:

==>{userid=1}
==>{userid=5}
==>{userid=10}
==>{userid=15}

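I would like the output to be more informative, for example by including each user's rank in the ordered map and the items (itemid) shared with the original user, like this: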

==>{userid=1, rank=0, reason_items={1,2,3,5}}
==>{userid=5, rank=1, reason_items={1,2,10}}
==>{userid=10, rank=2, reason_items={1,2,4}}
==>{userid=15, rank=3, reason_items={1,2}}

An efficient gremlin-groovy code example would be great!

Thanks.

2 answers:

Answer 0 (score: 1)

By adding an appropriate transform closure to the query:

rank = 0; itemsU1 = [] as Set; u1.out('Likes').aggregate(itemsU1).in('Likes')
      .filter{it != u1}.groupCount.cap.orderMap(T.decr)
      .transform{[id:it.id, rank:rank++, reason_item_ids:itemsU1.intersect(it.out('Likes').toSet()).collect{it.id}]}

...you get:

==>{id=User6, rank=0, reason_item_ids=[Item1, Item5]}
==>{id=User4, rank=1, reason_item_ids=[Item1, Item2]}
==>{id=User2, rank=2, reason_item_ids=[Item1]}
==>{id=User5, rank=3, reason_item_ids=[Item5]}
==>{id=User3, rank=4, reason_item_ids=[Item2]}

for the following sample graph:

g = new TinkerGraph()

u1 = g.addVertex('User1')
u2 = g.addVertex('User2')
u3 = g.addVertex('User3')
u4 = g.addVertex('User4')
u5 = g.addVertex('User5')
u6 = g.addVertex('User6')

i1 = g.addVertex('Item1')
i2 = g.addVertex('Item2')
i3 = g.addVertex('Item3')
i4 = g.addVertex('Item4')
i5 = g.addVertex('Item5')

g.addEdge(u1,i1,'Likes')
g.addEdge(u1,i2,'Likes')
g.addEdge(u1,i5,'Likes')
g.addEdge(u2,i1,'Likes')
g.addEdge(u2,i4,'Likes')
g.addEdge(u3,i2,'Likes')
g.addEdge(u4,i1,'Likes')
g.addEdge(u4,i2,'Likes')
g.addEdge(u4,i3,'Likes')
g.addEdge(u5,i4,'Likes')
g.addEdge(u5,i5,'Likes')
g.addEdge(u6,i1,'Likes')
g.addEdge(u6,i4,'Likes')
g.addEdge(u6,i5,'Likes')
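
To make the logic of the query above easier to follow, here is a minimal sketch of the same idea in plain Groovy, without TinkerPop: count the items each other user shares with User1, order by that count, and attach the shared item ids as the reason. The `likes` map and the `similarUsers` helper are assumptions made for illustration and are not part of Gremlin. Ties are broken by user id here, so users with equal counts may appear in a different order than in the Gremlin output above.

```groovy
// 'Likes' edges of the sample graph as a plain map: user -> set of liked items
def likes = [
    User1: ['Item1', 'Item2', 'Item5'] as Set,
    User2: ['Item1', 'Item4'] as Set,
    User3: ['Item2'] as Set,
    User4: ['Item1', 'Item2', 'Item3'] as Set,
    User5: ['Item4', 'Item5'] as Set,
    User6: ['Item1', 'Item4', 'Item5'] as Set
]

// Hypothetical helper (not a Gremlin step): rank other users by shared items
def similarUsers(Map likes, String user) {
    def mine = likes[user]
    def rank = 0
    likes.findAll { other, items -> other != user }                 // like filter{it != u1}
         .collect { other, items -> [id: other, shared: mine.intersect(items)] }
         .findAll { it.shared }                                     // drop users with nothing in common
         .sort { a, b -> b.shared.size() <=> a.shared.size() ?: a.id <=> b.id }
         .collect { [id: it.id, rank: rank++, reason_item_ids: it.shared.sort()] }
}

similarUsers(likes, 'User1').each { println it }
```

As in the Gremlin version, the intersection is recomputed per candidate user, which is the nested work that the second answer avoids.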

Answer 1 (score: 1)

Given Faber's sample graph, you can do this:

u = u1; m = [:].withDefault {[]}; rank = 0; key = null
u.out('Likes').as('item').in('Likes').except([u]).as('user').select().groupBy {
  key = it.getColumn('user')
} {
  m[key] << it.getColumn('item').id
} {
  it.size()
}.cap().orderMap(T.decr)[0..10].transform {[
  'userid'         : it.id,
  'rank'           : rank++,
  'reason_item_ids': m[it]
]}

This way, no nested traversal is needed inside .transform().
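
The single-pass idea can be sketched in plain Groovy as well: the shared items are collected while walking the edges, so the final mapping step only looks them up. This is an illustrative sketch under assumptions, not the Gremlin pipeline itself; the `edges` list and the `similarUsersOnePass` helper are hypothetical names, and ties are broken by user id.

```groovy
// The same 'Likes' edges as [user, item] pairs
def edges = [
    ['User1', 'Item1'], ['User1', 'Item2'], ['User1', 'Item5'],
    ['User2', 'Item1'], ['User2', 'Item4'],
    ['User3', 'Item2'],
    ['User4', 'Item1'], ['User4', 'Item2'], ['User4', 'Item3'],
    ['User5', 'Item4'], ['User5', 'Item5'],
    ['User6', 'Item1'], ['User6', 'Item4'], ['User6', 'Item5']
]

// Hypothetical helper (not a Gremlin step)
def similarUsersOnePass(List edges, String user) {
    // Items liked by the starting user
    def mine = edges.findAll { it[0] == user }.collect { it[1] } as Set
    // One pass over the edges: remember each shared item under the other user,
    // playing the role of the map m in the query above
    def reasons = [:].withDefault { [] }
    edges.each { u, item ->
        if (u != user && item in mine) reasons[u] << item
    }
    def rank = 0
    // Order by number of shared items, descending, then only look the reasons up
    reasons.sort { a, b -> b.value.size() <=> a.value.size() ?: a.key <=> b.key }
           .collect { u, items -> [userid: u, rank: rank++, reason_item_ids: items] }
}

similarUsersOnePass(edges, 'User1').each { println it }
```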