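Any suggestions for improving performance, either by changing the graph structure or by changing the queries below? Ideally I'd like to get this to sub-1s. Right now the best I can get is ~8s, with 7M db hits over roughly 2M nodes and 10M rels.
I have this graph structure: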
(co:Company)<-[:HAS_PROVIDER]-(c:Customer)-[:HAS_CLAIM]->(cl:Claim)
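For each company, I'd like to be able to show its name, the number of customers, and the count and total amount of its direct and indirect claims.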
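To achieve this, I've tried two approaches:
a) Created a relationship from Customer to Claim, typed :IS_DIRECT where direct = true and :IS_INDIRECT where direct = false
b) Labeled each direct = true claim as a :DirectClaim node and each direct = false claim as an :InDirectClaim node
Using (a) lets me get the company name, the customer count, size(IS_DIRECT) and size(IS_INDIRECT) by filtering on the rel TYPE. However, getting sum(amount) with any combination of extract, filter and reduce times out, regardless of configuration.
Using (b) works but takes ~10s.
EDIT:
The query for (a) looks like this (hat tip to @cybersam, this is now ~6s):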
MATCH (co:Company)<-[:HAS_PROVIDER]-(c)-[r:IS_DIRECT|IS_INDIRECT]->(cl)
WITH DISTINCT co, collect(r) AS rels, count(DISTINCT c) AS cntc, collect(cl) AS claims
WITH co, cntc,
     // count claims per relationship type
     size(filter(r IN rels WHERE type(r) = 'IS_DIRECT')) AS dcls,
     size(filter(r IN rels WHERE type(r) = 'IS_INDIRECT')) AS indcls,
     // sum direct and indirect amounts in a single pass over the claims
     reduce(s = {dclsamt: 0, indclsamt: 0}, x IN claims |
       CASE WHEN x.direct
         THEN {dclsamt: s.dclsamt + x.amount, indclsamt: s.indclsamt}
         ELSE {dclsamt: s.dclsamt, indclsamt: s.indclsamt + x.amount}
       END) AS data
RETURN co.name AS name, cntc, dcls, indcls, data
ORDER BY dcls DESC
The query for (b) looks like this:
MATCH (co:Company)<-[:HAS_PROVIDER]-(c)-[:HAS_CLAIM]->(cl)
WITH DISTINCT co, count(DISTINCT c) AS cntc, collect(cl) AS cls
WITH co, cntc,
     // split the claims by label
     filter(n IN cls WHERE 'DirectClaim' IN labels(n)) AS dcls,
     filter(n IN cls WHERE 'InDirectClaim' IN labels(n)) AS indcls
WITH co, cntc, size(dcls) AS dclsct, size(indcls) AS indclsct,
     reduce(s = 0, x IN dcls | s + x.amount) AS dclsamt,
     reduce(s = 0, x IN indcls | s + x.amount) AS indclsamt
RETURN co.name AS name, cntc, dclsct, dclsamt, indclsct, indclsamt
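Answer 0 (score: 2)
There is no need to add extra data (such as redundant relationships or labels) to your data model.
This query shows one way to return the desired results (in the returned data map, t is the sum of the true amounts and tc is the count of true amounts; similarly for f and fc):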
MATCH (co:Company)<-[:HAS_PROVIDER]-(cu:Customer)-[:HAS_CLAIM]->(cl:Claim)
WITH co.name as comp_name, COUNT(DISTINCT cu) AS cust_count,
REDUCE(s = {t: 0, tc: 0, f: 0, fc: 0}, x IN COLLECT(cl) |
CASE WHEN x.direct
THEN {t: s.t + x.amount, tc: s.tc + 1, f: s.f, fc: s.fc}
ELSE {t: s.t, tc: s.tc, f: s.f + x.amount, fc: s.fc + 1}
END) AS data
RETURN comp_name, cust_count, data
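As a variation on the query above (a sketch, not part of the original answer), the same four values can be computed with Cypher's built-in aggregating functions, so that no per-company list of claims has to be collected first. The column names t, tc, f and fc are reused from the data map; whether this is actually faster on a given dataset is an assumption that should be checked with PROFILE.

```cypher
// Sketch: same grouping as the REDUCE version, using sum()/count() over CASE.
// count() skips nulls, so a CASE that yields null for non-matching rows
// counts only the matching ones.
MATCH (co:Company)<-[:HAS_PROVIDER]-(cu:Customer)-[:HAS_CLAIM]->(cl:Claim)
RETURN co.name AS comp_name,
       count(DISTINCT cu) AS cust_count,
       sum(CASE WHEN cl.direct THEN cl.amount ELSE 0 END) AS t,
       count(CASE WHEN cl.direct THEN 1 END) AS tc,
       sum(CASE WHEN cl.direct THEN 0 ELSE cl.amount END) AS f,
       count(CASE WHEN cl.direct THEN null ELSE 1 END) AS fc
```

One behavioural difference: sum() ignores null amounts, whereas adding a null amount inside REDUCE turns the running total into null.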
Answer 1 (score: 1)
Your current data model is sufficient to produce the results you want.
MATCH (co:Company)<-[:HAS_PROVIDER]-(cust:Customer)-[:HAS_CLAIM]->(claim:Claim)
WITH co, COUNT(DISTINCT cust) AS custs, COLLECT(DISTINCT claim) AS claims
WITH co,
custs,
[x IN claims WHERE x.direct|x.amount] AS direct_amts,
[x IN claims WHERE NOT x.direct|x.amount] AS indirect_amts
RETURN co,
custs,
SIZE(direct_amts) AS direct_count,
REDUCE(s=0, x IN direct_amts| s+x) AS direct_amt_total,
SIZE(indirect_amts) AS indirect_count,
REDUCE(s=0, x IN indirect_amts| s+x) AS indirect_amt_total
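If you really need speed, make sure you have indexes on :Claim(direct) and :Claim(amount), and this will really scream. Alternatively, convert the boolean property into a second label (i.e. (claim:Claim:Direct)) and you can save yourself an index.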
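A minimal sketch of both options follows. It assumes a Neo4j 3.x server (the filter()/extract() functions used in the question date from that era); newer versions use a different index syntax, shown in the comment, where the index name is made up for illustration.

```cypher
// Option 1: schema indexes on the two properties (Neo4j 3.x syntax).
CREATE INDEX ON :Claim(direct);
CREATE INDEX ON :Claim(amount);

// On Neo4j 4.x and later the equivalent form is, for example:
// CREATE INDEX claim_direct FOR (c:Claim) ON (c.direct);

// Option 2: mirror the boolean as a second label instead of indexing it.
// On millions of claims this single statement may need to be run in batches.
MATCH (cl:Claim)
WHERE cl.direct = true
SET cl:Direct;
```

With the label in place, direct claims can be matched as (cl:Claim:Direct) without reading the direct property at all.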
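UPDATE: Based on your comments, your only real avenue for improvement depends on your use case. If this graph is "live" and will be updated constantly, you can always cache the counts and total amounts on the :Customer node whenever you add, remove, or change a :Claim. Where graphs shine is in small, frequent queries that touch a subset of the database. So whenever you do anything to a claim, re-run the aggregation only for the corresponding :Customer, store the results on the Customer as properties, and then for your big report just pull those properties directly off the :Customer nodes (which are far fewer in number).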
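The caching idea could be sketched roughly as below. The id property, the $customerId parameter and the four cached property names are all assumptions made for illustration; none of them come from the question.

```cypher
// Step 1: run after any create/update/delete of a claim of one customer.
// Assumption: customers have a unique `id` property, passed in as $customerId.
MATCH (cu:Customer {id: $customerId})
OPTIONAL MATCH (cu)-[:HAS_CLAIM]->(cl:Claim)
WITH cu,
     count(CASE WHEN cl.direct THEN 1 END) AS dc,
     sum(CASE WHEN cl.direct THEN cl.amount ELSE 0 END) AS da,
     count(CASE WHEN NOT cl.direct THEN 1 END) AS ic,
     sum(CASE WHEN NOT cl.direct THEN cl.amount ELSE 0 END) AS ia
SET cu.direct_count = dc, cu.direct_total = da,
    cu.indirect_count = ic, cu.indirect_total = ia;

// Step 2: the big report reads only the cached values and never touches claims.
MATCH (co:Company)<-[:HAS_PROVIDER]-(cu:Customer)
RETURN co.name AS name,
       count(cu) AS customers,
       sum(cu.direct_count) AS direct_count,
       sum(cu.direct_total) AS direct_total,
       sum(cu.indirect_count) AS indirect_count,
       sum(cu.indirect_total) AS indirect_total
ORDER BY direct_count DESC;
```

Note that the report in step 2 also counts customers that have no claims, unlike the queries above, and it assumes each customer has at most one HAS_PROVIDER relationship to a given company.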