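I'm working on a Cypher query that returns a combined, limited result across two sets: one set is the direct neighbors, and the other is the neighbors reached across an "event node", as follows: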
OPTIONAL MATCH (subject:Person {age:"38"})--(event:Event)--(targetViaEvent)
OPTIONAL MATCH (subject)--(directTarget)
WHERE NOT directTarget:Event
WITH subject, targetViaEvent, directTarget,
COUNT(event) AS eventCount
ORDER BY eventCount DESC
WITH subject, COLLECT(directTarget) + COLLECT(targetViaEvent) as targetList
UNWIND targetList AS target
WITH DISTINCT subject, target
SKIP 0 LIMIT 10
...
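The main goals of this Cypher query are:
1. Find the subject's direct neighbors that are not Event nodes.
2. If a neighbor is an Event, find the other neighbors of that event.
3. Paginate both sets with SKIP and LIMIT.
4. If possible, return the neighbors tied to Event nodes ahead of the others.
The issue: when I use COLLECT(), execution becomes so slow that the neo4j shell stalls, because each subject may have around ten thousand directTarget and targetViaEvent nodes. I suspect COLLECT() keeps every matched node object in memory and so chokes Neo4j at this data size. All I want is to combine the two sets and apply a single overall limit. Is there any trick to improve my Cypher?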
Edit:
Since @InverseFalcon pointed out my mistake in the Cypher above, here is the updated version of my whole query:
PROFILE MATCH (subject:Person {age:"38"})
OPTIONAL MATCH (subject)--(directTarget)
WHERE NOT directTarget:Event
OPTIONAL MATCH (subject)--(event:Event)--(targetViaEvent)
WITH subject, targetViaEvent, directTarget,
COUNT(event) AS eventCount ORDER BY eventCount DESC
WITH subject, COLLECT(directTarget) + COLLECT(targetViaEvent) as targetList
UNWIND targetList AS target
WITH DISTINCT subject, target SKIP 0 LIMIT 300 WHERE target IS NOT NULL
OPTIONAL MATCH (subject)-[subject_target]-(target)
OPTIONAL MATCH (subject)--(eventPrime)--(target)
WITH subject, subject_target, target, COLLECT(eventPrime)[0..200] AS eventList
UNWIND (CASE eventList WHEN [] THEN [null] else eventList end) as limitedEvents
OPTIONAL MATCH (subject)-[subject_event]-(limitedEvents)-[event_target]-(target)
RETURN subject, subject_target, target, subject_event, limitedEvents, event_target
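Note: after SKIP...LIMIT... I repeat the matches only to identify the relationships between the nodes, because a) I want to build the relationships into the JSON result; and b) after many attempts I could not manage to capture the relationships in the first 3 MATCH clauses. In particular, COUNT(event) stops working, because each event is then tied to a single relationship, so the count is always 1.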
Answer 0 (score: 2)
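We can improve your query a bit. Right now you are building rows as a Cartesian product of every directTarget with every event + targetViaEvent pair, so you are doing a lot of work that isn't needed. A good approach, especially for back-to-back MATCH or OPTIONAL MATCH clauses where you want to aggregate both, is to aggregate each one separately instead of trying to aggregate everything at once. This avoids the Cartesian product.
I'd suggest this as an alternative query: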
MATCH (subject:Person {age:"38"})
OPTIONAL MATCH (subject)--(event:Event)--(targetViaEvent)
WITH subject, COUNT(event) AS eventCount, targetViaEvent
ORDER BY eventCount DESC
WITH subject, COLLECT(targetViaEvent) as eventTargets
// Above WITH means we now have only one row per subject so far
OPTIONAL MATCH (subject)--(directTarget)
WHERE NOT directTarget:Event
WITH subject, COLLECT(directTarget) + eventTargets as targetList
UNWIND targetList AS target
WITH DISTINCT subject, target SKIP 0 LIMIT 10
...
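To make the row-count argument concrete, here is a minimal data-free sketch. It assumes plain integers standing in for nodes (3 hypothetical direct targets, 4 hypothetical event targets) and borrows the variable names from the queries above; rowsBuilt and targetsCollected are made-up column names. It only illustrates how many rows each shape builds, not the real query.

```cypher
// Back-to-back expansion: every directTarget row is repeated for every
// targetViaEvent row, so 3 x 4 = 12 rows exist before any aggregation.
UNWIND range(1, 3) AS directTarget
UNWIND range(101, 104) AS targetViaEvent
RETURN count(*) AS rowsBuilt;

// Aggregating after each expansion: collect() collapses the first expansion
// back to a single row before the second one starts, so the two sets add up
// (3 + 4 = 7 collected items) instead of multiplying.
UNWIND range(1, 3) AS directTarget
WITH collect(directTarget) AS directTargets
UNWIND range(101, 104) AS targetViaEvent
WITH directTargets, collect(targetViaEvent) AS eventTargets
RETURN size(directTargets) + size(eventTargets) AS targetsCollected;
```

With roughly ten thousand items on each side, as described in the question, the first shape would build on the order of a hundred million rows per subject, while the second stays in the tens of thousands.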
Edit
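I just spotted an issue in the original query. Your two OPTIONAL MATCH clauses share the 'subject' variable. That makes your second OPTIONAL MATCH depend on the subjects found by your first OPTIONAL MATCH. It will not look for that pattern for persons who did not match your first OPTIONAL MATCH.
Basically, that pair of OPTIONAL MATCH clauses should behave the same as if the first OPTIONAL MATCH were a MATCH.
If your intent is to run both OPTIONAL MATCH clauses on all persons, then you may need to change the first part of the query to: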
MATCH (subject:Person {age:"38"})
OPTIONAL MATCH (subject)--(event:Event)--(targetViaEvent)
OPTIONAL MATCH (subject)--(directTarget)
...
This may affect the speed of the original query and the number of results it builds.
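The dependency between the two OPTIONAL MATCH clauses can be seen in a small sketch. It assumes a label, here the made-up NoSuchLabel, that no node in the database carries, so the first OPTIONAL MATCH finds nothing and binds subject to null.

```cypher
// First OPTIONAL MATCH finds no node, so subject is bound to null.
OPTIONAL MATCH (subject:NoSuchLabel)
// Second OPTIONAL MATCH starts from the null subject, so it cannot match
// anything either and directTarget is null as well.
OPTIONAL MATCH (subject)--(directTarget)
// One row comes back, with both columns null.
RETURN subject, directTarget;
```

Anchoring the query with a plain MATCH on the subject first, as suggested above, means each later OPTIONAL MATCH starts from a real node rather than from whatever an earlier optional pattern happened to find.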
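Also, both of our queries (after your change) will return rows for subjects without targets, where neither OPTIONAL MATCH matched anything for the subject (in those cases, a single subject with a null target). If you don't want these in the return, we both need to add WHERE target IS NOT NULL after the final WITH.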
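One detail worth checking when adding that filter: as far as I understand Cypher, a WHERE attached to a WITH that also carries SKIP/LIMIT is evaluated after the limit, so a page could come back with fewer rows than requested. A minimal sketch of filtering first and paginating afterwards, using a literal list as a stand-in for the unwound targetList (the values are made up):

```cypher
UNWIND [null, 1, 2, 3] AS target
// Filter in its own WITH, before any pagination is applied.
WITH DISTINCT target
WHERE target IS NOT NULL
// Then paginate only the non-null targets.
WITH target
SKIP 0 LIMIT 2
RETURN target;
```

Without an ORDER BY, which two of the non-null targets land on the page is not guaranteed, so a real paginated query would also want a stable sort before SKIP and LIMIT.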