Is it possible to use multiple levels of aggregation in a single Cypher query?

Time: 2019-01-17 22:50:26

Tags: neo4j cypher

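I have written a Cypher query that finds the product of the inverse node degrees for every path between a given source and target. I would also like to include the percentage of the total that each path represents, where the total is the sum of those products over all paths. However, when I compute this sum in a WITH clause (as shown below), the percentage returned is always 100.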

MATCH path = (n0:Compound)-[:BINDS_CbG]-(n1)-[:PARTICIPATES_GpPW]-(n4:Disease)
WHERE n0.identifier = "DB01156"
AND n4.identifier = "DOID:0050742"
WITH
[
...
] AS degrees, path

// Adding a second with query allows us to access PDP for creating the PERCENT_OF_DWPC field in the return
WITH degrees, path, reduce(pdp = 1.0, d in degrees| pdp * d ^ -1) AS PDP
WITH path, PDP, sum(PDP) AS DWPC
RETURN
path,
PDP,
100 * (PDP / DWPC) AS PERCENT_OF_DWPC

ORDER BY PERCENT_OF_DWPC DESC

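I know this can be achieved by effectively writing the query twice (as shown below), but it then takes twice as long to run. Is it possible to avoid the extra overhead and compute the percentage within a WITH clause?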

MATCH path = (n0:Compound)-[:BINDS_CbG]-(n1)-[:PARTICIPATES_GpPW]-(n4:Disease)
WHERE n0.identifier = "DB01156"
AND n4.identifier = "DOID:0050742"
WITH
[
...
] AS degrees, path

WITH sum(reduce(pdp = 1.0, d in degrees| pdp * d ^ -0.4)) AS DWPC

MATCH path = (n0:Compound)-[:BINDS_CbG]-(n1)-[:PARTICIPATES_GpPW]-(n4:Disease)
WHERE n0.identifier = "DB01156"
AND n4.identifier = "DOID:0050742"
WITH
[
...
] AS degrees, path, DWPC

WITH path, DWPC, reduce(pdp = 1.0, d in degrees| pdp * d ^ -0.4) AS PDP
RETURN
path,
PDP,
100 * (PDP / DWPC) AS PERCENT_OF_DWPC

ORDER BY PERCENT_OF_DWPC DESC

1 answer:

Answer 0 (score: 2)

You can replace this snippet:

WITH path, PDP, sum(PDP) AS DWPC
RETURN
  path,
  PDP,
  100 * (PDP / DWPC) AS PERCENT_OF_DWPC
ORDER BY PERCENT_OF_DWPC DESC

with this:

WITH path, collect(PDP) AS pdps, sum(PDP) AS DWPC
UNWIND pdps AS PDP
RETURN
  path,
  PDP,
  100 * (tofloat(PDP) / DWPC) AS PERCENT_OF_DWPC
ORDER BY PERCENT_OF_DWPC DESC

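The original WITH clause has 2 "grouping keys", path and PDP, so the sum() aggregating function sums over each distinct path and PDP pair (in other words, each sum uses only a single PDP value), which is not what you want.

The new WITH clause adds the aggregating function collect() to gather all the PDP values. Since this new clause now has only one non-aggregating term, path, that term serves as the entire grouping key (for both functions). This causes the sum() function to actually sum all the PDP values for the same path. The UNWIND clause is then used to separate out the individual PDP values again. Also, I assumed that PDP is an integer, so the tofloat() function is used to ensure that the division does not perform integer truncation.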
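To see the grouping-key behaviour in isolation, here is a minimal sketch that runs without any graph data. The literal list and the names k, v, vs, total and pct are hypothetical stand-ins, not part of the original query: k plays the role of the grouping key and v the role of PDP.

```cypher
// Hypothetical stand-in rows: two values under key 'a', one under key 'b'
UNWIND [{k: 'a', v: 1.0}, {k: 'a', v: 3.0}, {k: 'b', v: 4.0}] AS row
// k is the only non-aggregating term, so it alone is the grouping key;
// collect() and sum() are both evaluated once per distinct k
WITH row.k AS k, collect(row.v) AS vs, sum(row.v) AS total
// UNWIND turns each collected list back into one row per value
UNWIND vs AS v
RETURN k, v, 100 * v / total AS pct
ORDER BY k, v
```

Under key 'a' the total is 4.0, so the two rows come out at 25 and 75 percent, while the single row under 'b' is 100 percent. If v were also listed as a non-aggregating term in the WITH clause, every group would hold a single value and every percentage would be 100, which is the symptom described in the question.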

[Update]

If you actually need to compute the sum DWPC across all path values, the new snippet would be a bit more complex.
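One possible way to do that, offered as a sketch under assumptions and not necessarily the snippet the answer originally had in mind, is to aggregate with no non-aggregating term at all, so that every row falls into a single group. Each path is collected together with its PDP in a map, and the list is unwound afterwards. The first two lines are a hypothetical stand-in for the MATCH and reduce() part of the question (string labels in place of real paths), and the names r, rows and row are made up for illustration.

```cypher
// Hypothetical stand-in for the MATCH ... reduce(...) AS PDP part of the question
UNWIND [{path: 'p1', PDP: 0.5}, {path: 'p2', PDP: 0.25}, {path: 'p3', PDP: 0.25}] AS r
WITH r.path AS path, r.PDP AS PDP
// No non-aggregating term here, so sum() runs over all rows at once
WITH collect({path: path, PDP: PDP}) AS rows, sum(PDP) AS DWPC
// Restore one row per path, each still paired with its own PDP
UNWIND rows AS row
RETURN
  row.path AS path,
  row.PDP AS PDP,
  100 * (row.PDP / DWPC) AS PERCENT_OF_DWPC
ORDER BY PERCENT_OF_DWPC DESC
```

In the real query, only the part from the collect() line onward would replace the original snippet. The trade-off is that all paths are held in memory in one list before being unwound, but the pattern is matched only once.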