Sum an edge property and group by the incoming vertex

Date: 2019-07-12 23:37:00

Tags: graph gremlin tinkerpop

I have three vertex labels: site, person, and interest-category. The edges are:

site->hasUser->person

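person->hasInterest(count: N)->interest-category (the count property holds a number: the interest weight for that particular user-interest relationship).

Basically, I want to get all the interests related to a site by way of its users' interests. An example result would be: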

interest-category: news, totalCount: 15
interest-category: media, totalCount: 20

where totalCount is the sum of the count edge property over all of the site's users for that category.

This is the test sample I am using:

graph = TinkerGraph.open()
g = graph.traversal()

g.addV('person').property(id, 'rodrigo').property('name', 'rodrigo').next()
g.addV('person').property(id, 'john').property('name', 'john').next()
g.addV('site').property(id, 'foxsports').property('name', 'Fox Sports').next()
g.addV('interest-category').property(id, 'sports').property('name', 'Sports').property('level', 'l1').next()
g.addV('interest-category').property(id, 'media').property('name', 'Media').property('level', 'l1').next()
g.addV('interest-category').property(id, 'business-and-finance').property('name', 'Business & Finance').property('level', 'l1').next()
g.addV('interest-category').property(id, 'soccer').property('name', 'Soccer').property('level', 'l2').next()
g.addV('interest-category').property(id, 'basketball').property('name', 'Basketball').property('level', 'l2').next()
g.addV('interest-category').property(id, 'mma').property('name', 'MMA').property('level', 'l2').next()
g.addV('interest-category').property(id, 'news').property('name', 'News').property('level', 'l2').next()
g.addV('interest-category').property(id, 'finance').property('name', 'Finance').property('level', 'l2').next()
g.addV('interest-category').property(id, 'sports-industry').property('name', 'Sports Industry').property('level', 'l2').next()


// Child traversals are spawned from __ rather than g (required since TinkerPop 3.5.0).
g.addE('hasUser').from(__.V('foxsports')).to(__.V('rodrigo')).next()
g.addE('hasUser').from(__.V('foxsports')).to(__.V('john')).next()
g.addE('hasSubCategory').from(__.V('sports')).to(__.V('soccer')).next()
g.addE('hasSubCategory').from(__.V('sports')).to(__.V('basketball')).next()
g.addE('hasSubCategory').from(__.V('sports')).to(__.V('mma')).next()
g.addE('hasSubCategory').from(__.V('media')).to(__.V('news')).next()
g.addE('hasSubCategory').from(__.V('business-and-finance')).to(__.V('finance')).next()
g.addE('hasSubCategory').from(__.V('business-and-finance')).to(__.V('sports-industry')).next()
g.addE('hasInterest').from(__.V('john')).to(__.V('sports')).property('count', 5).next()
g.addE('hasInterest').from(__.V('rodrigo')).to(__.V('sports')).property('count', 5).next()
g.addE('hasInterest').from(__.V('rodrigo')).to(__.V('media')).property('count', 3).next()
g.addE('hasInterest').from(__.V('rodrigo')).to(__.V('business-and-finance')).property('count', 1).next()
g.addE('hasInterest').from(__.V('rodrigo')).to(__.V('soccer')).property('count', 1).next()
g.addE('hasInterest').from(__.V('rodrigo')).to(__.V('basketball')).property('count', 1).next()
g.addE('hasInterest').from(__.V('rodrigo')).to(__.V('mma')).property('count', 2).next()

1 Answer:

Answer 0 (score: 3)

I don't see how you get from the sample graph you provided to this output:

interest-category: news, totalCount: 15
interest-category: media, totalCount: 20

However, going by your plain-English description, I'd say you want this query:

gremlin> g.V('foxsports').
           out('hasUser').
           outE('hasInterest').
           group().
             by(inV().values('name')).
             by(values('count').sum())
==>[MMA:2,Business & Finance:1,Soccer:1,Media:3,Basketball:1,Sports:10]
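
As a sanity check on these numbers, the same grouping can be reproduced in plain Python without any graph library. This is only an illustrative sketch: the `has_user` and `has_interest` structures and the `total_interest_by_category` helper are hypothetical names that mirror the sample data, not part of TinkerPop. It shows, for instance, that Sports ends up at 10 because john and rodrigo each contribute 5.

```python
from collections import defaultdict

# Edges copied by hand from the sample graph in the question (assumed layout).
has_user = {"foxsports": ["rodrigo", "john"]}
has_interest = [
    # (person id, interest-category name, count on the hasInterest edge)
    ("john", "Sports", 5),
    ("rodrigo", "Sports", 5),
    ("rodrigo", "Media", 3),
    ("rodrigo", "Business & Finance", 1),
    ("rodrigo", "Soccer", 1),
    ("rodrigo", "Basketball", 1),
    ("rodrigo", "MMA", 2),
]


def total_interest_by_category(site):
    """Sum the edge counts per category over every user of the given site."""
    users = set(has_user.get(site, []))
    totals = defaultdict(int)
    for person, category, count in has_interest:
        if person in users:
            # Same role as group().by(inV().values('name')).by(values('count').sum())
            totals[category] += count
    return dict(totals)


totals = total_interest_by_category("foxsports")
print(totals)
```

The Gremlin query does the same work inside the graph engine, so nothing has to be pulled into client memory first.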

Reformatted to match your output shape:

gremlin> g.V('foxsports').
           out('hasUser').
           outE('hasInterest').
           group().
             by(inV().values('name')).
             by(values('count').sum()).
           unfold().
           project('interest-category','totalCount').
             by(keys).
             by(values)
==>[interest-category:MMA,totalCount:2]
==>[interest-category:Business & Finance,totalCount:1]
==>[interest-category:Soccer,totalCount:1]
==>[interest-category:Media,totalCount:3]
==>[interest-category:Basketball,totalCount:1]
==>[interest-category:Sports,totalCount:10]
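
The `unfold().project(...)` tail does not change any totals; it only reshapes the grouped map into one record per entry. A rough Python equivalent, assuming the map returned by the first query as input (the `to_records` helper is a hypothetical name used only for illustration):

```python
def to_records(totals):
    """Turn a {name: total} map into one dict per entry,
    mirroring unfold() followed by project('interest-category', 'totalCount')."""
    return [
        {"interest-category": name, "totalCount": total}
        for name, total in totals.items()
    ]


# The grouped result of the first query, written out as a literal.
grouped = {
    "MMA": 2,
    "Business & Finance": 1,
    "Soccer": 1,
    "Media": 3,
    "Basketball": 1,
    "Sports": 10,
}

records = to_records(grouped)
for record in records:
    print(record)
```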