Cypher query to iterate over all nodes of a given type and group related nodes

Time: 2017-04-04 15:07:21

Tags: foreach neo4j cypher batch-processing

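Could someone help a neo4j newbie (this is my second day!) with the following problem?

I have a database containing all the players (P) who have played in game sessions (S), along with the scores they achieved. There are many sessions; call them S1, S2, S3, and so on. There are also many players: P1, P2, P3, etc.

Each session has players, for example: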

(P)-[:PLAYED]->(S)

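Each session has a variable number of players, from 2 to 10.

What I want to do is visit every session, get each player who took part in it, and rank them by score. The players need to be sorted by score first and then ranked, so that each player has a BEAT relationship to the next-lower-scoring player in that session. Normally I would use a FOREACH loop, but I can't figure out how to do the same thing in Cypher.

For example, S1 has players P1, P3 and P5. If P3 scored 100, P1 scored 70, and P5 scored 30, I want to create the following relationships: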

(P3)-[:BEAT]->(P1)-[:BEAT]->(P5)

I need to do this for every session. What is the best way to approach this?

Regards

3 Answers:

Answer 0: (score: 3)

Assuming the score is stored on the :PLAYED relationship, this should work:

// Find all players who have played in a session
MATCH (p:Player)-[r:PLAYED]->(s:Session)           
// for each Session, order the players by their score for that session
WITH s, p ORDER BY r.score DESC       
// for each session, group the players (now ordered by their scores)
WITH s, COLLECT(p) AS players   
// iterate over the sequence 0..number_of_players_in_this_session-2
UNWIND range(0,size(players)-2) AS i      
// grab pairs of players, starting from the highest scoring
WITH players[i] AS l1, players[i+1] AS l2      
// create :BEAT relationship
CREATE (l1)-[:BEAT]->(l2)

There is a simple Neo4j console example here.
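To try the query locally, a minimal dataset matching the example in the question (S1 with P3 = 100, P1 = 70, P5 = 30) could be set up roughly as follows. This is only a sketch: the `name` property on `Player` and `Session` is an assumption made for illustration and is not part of the original data model.

```cypher
// Sample data for the S1 example; the `name` property is assumed for illustration
CREATE (s1:Session {name: 'S1'}),
       (p1:Player {name: 'P1'}),
       (p3:Player {name: 'P3'}),
       (p5:Player {name: 'P5'}),
       (p3)-[:PLAYED {score: 100}]->(s1),
       (p1)-[:PLAYED {score: 70}]->(s1),
       (p5)-[:PLAYED {score: 30}]->(s1)
```

After running the ranking query above, the resulting chain can be inspected with:

```cypher
// List every BEAT pair; for the sample data this should show P3 -> P1 and P1 -> P5
MATCH (winner:Player)-[:BEAT]->(loser:Player)
RETURN winner.name AS winner, loser.name AS loser
```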

Of course, there is a data modeling issue here: the :BEAT relationship is not associated with a particular session.
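One possible way to address this, sketched under the assumption that each `Session` node carries a unique `id` property (a hypothetical name, not confirmed by the question), is to keep `s` in scope and record the session on the relationship itself:

```cypher
// Same pipeline as above, but `s` is carried through so it can be stored on :BEAT
MATCH (p:Player)-[r:PLAYED]->(s:Session)
WITH s, p ORDER BY r.score DESC
WITH s, collect(p) AS players
UNWIND range(0, size(players) - 2) AS i
WITH s, players[i] AS winner, players[i + 1] AS loser
// `sessionId` and `s.id` are assumed names used for illustration
CREATE (winner)-[:BEAT {sessionId: s.id}]->(loser)
```

With this variant, two players who meet in several sessions get one :BEAT relationship per session, each distinguishable by its `sessionId`.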

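Answer 1: (score: 1)

Just adding a shortcut that saves a few lines of Cypher: you can install APOC Procedures and use apoc.nodes.link() to quickly create the chain of relationships.

Building on William Lyon's query: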

// Find all players who have played in a session
MATCH (p:Player)-[r:PLAYED]->(s:Session)           
// for each Session, order the players by their score for that session
WITH s, p ORDER BY r.score DESC       
// for each session, group the players (now ordered by their scores)
WITH s, COLLECT(p) AS players  
CALL apoc.nodes.link(players, 'BEAT')
// can't end a query with CALL, so just do a dummy return of some kind
RETURN DISTINCT true

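Answer 2: (score: 0)

In case anyone finds this post useful, I'd like to add that when processing a large result set (and to avoid running out of memory), you can try this: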

// Find all players who have played in a session
MATCH (p:Player)-[r:PLAYED]->(s:Session)           
// for each Session, order the players by their score for that session
WITH s, p ORDER BY r.score DESC
//Paginate/batch process results to avoid exhausting memory
SKIP 500000*n
LIMIT 500000          
// for each session, group the players (now ordered by their scores)
WITH s, COLLECT(p) AS players  
CALL apoc.nodes.link(players, 'BEAT')
// can't end a query with CALL, so just do a dummy return of some kind
RETURN DISTINCT true

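That is, the lines

SKIP 500000*n
LIMIT 500000

have been added. Replace n with 0 for the first run, then keep incrementing it until no more records are updated.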
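One caveat with paginating the (session, player) rows directly is that a batch boundary can fall in the middle of a session, so that session's chain would be built in two separate pieces. A possible alternative, given here only as a sketch, is to page over the sessions themselves and then expand each one. The `$offset` and `$batchSize` parameters are names assumed for illustration, and ordering by `id(s)` is just one way to get a stable order (the `id()` function is deprecated in recent Neo4j versions).

```cypher
// Page over sessions rather than over (session, player) rows
MATCH (s:Session)
WITH s ORDER BY id(s)
SKIP $offset LIMIT $batchSize
// Expand only the sessions in this batch
MATCH (p:Player)-[r:PLAYED]->(s)
WITH s, p ORDER BY r.score DESC
WITH s, collect(p) AS players
CALL apoc.nodes.link(players, 'BEAT')
// Returns 0 once the offset is past the last session
RETURN count(*) AS sessionsProcessed
```

Increase `$offset` by `$batchSize` after each run and stop when `sessionsProcessed` comes back as 0.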

Thanks to everyone who contributed to this thread.