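I am running PageRank on a set of nodes of type Paper, where each node has a property year. I currently normalize each PageRank score by year, using the mean and standard deviation of the PageRank scores of all papers from that year.
I would like to return the top 100 papers for each year (based on the scaled PageRank value). Can I do this in a single query?
The query below computes the scaled scores but returns the top 100 results overall, not the top 100 per year: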
CALL algo.pageRank.stream(
'MATCH (p:Paper) WHERE p.year < 2015 RETURN id(p) as id',
'MATCH (p1:Paper)-[:CITES]->(p2:Paper) RETURN id(p1) as source, id(p2) as target',
{graph:'cypher', iterations:20, write:false, concurrency:20})
YIELD node, score
WITH
node.title AS title,
node.year AS year,
score AS page_rank
ORDER BY page_rank DESC
LIMIT 100
WITH year, COLLECT({title: title, page_rank: page_rank}) AS data, AVG(page_rank) AS avg_page_rank, stDev(page_rank) as stdDev
UNWIND data AS d
RETURN year, d.title AS title, ABS(d.page_rank-avg_page_rank)/stdDev AS scaled_score;
Any suggestions would be greatly appreciated!
Answer 0 (score: 2)
Try this:
CALL algo.pageRank.stream(
'MATCH (p:Paper) WHERE p.year < 2015 RETURN id(p) as id',
'MATCH (p1:Paper)-[:CITES]->(p2:Paper) RETURN id(p1) as source, id(p2) as target',
{graph:'cypher', iterations:20, write:false, concurrency:20})
YIELD node, score
WITH
node.title AS title,
node.year AS year,
score AS page_rank
ORDER BY page_rank DESC
WITH year, COLLECT({title: title, page_rank: page_rank})[..100] AS data, AVG(page_rank) AS avg_page_rank, stDev(page_rank) as stdDev
UNWIND data AS d
RETURN year, d.title AS title, ABS(d.page_rank-avg_page_rank)/stdDev AS scaled_score;
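This query drops the LIMIT clause and instead keeps only the top 100 (sorted) data items for each year. Because the LIMIT no longer runs before the aggregation, AVG and stDev are computed over all papers of a year, not just over the 100 highest-scoring papers overall.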
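To make the grouping logic easier to follow, here is a minimal Python sketch of what the query does after PageRank has produced its scores. It is not part of the original answer. The function name top_n_per_year and the (title, year, page_rank) row shape are hypothetical, chosen only for illustration. It assumes every year has at least two papers with differing scores, so the sample standard deviation is defined and non-zero.

```python
from collections import defaultdict
from statistics import mean, stdev


def top_n_per_year(rows, n=100):
    """Return (year, title, scaled_score) for the n highest-ranked papers per year.

    rows: iterable of (title, year, page_rank) tuples (hypothetical input shape).
    """
    by_year = defaultdict(list)
    # Sort once by score, descending, like ORDER BY page_rank DESC,
    # so each per-year list is already ordered when it is built.
    for title, year, page_rank in sorted(rows, key=lambda r: r[2], reverse=True):
        by_year[year].append((title, page_rank))

    result = []
    for year, data in by_year.items():
        # Statistics use ALL papers of the year, not only the top n.
        scores = [score for _, score in data]
        avg = mean(scores)
        sd = stdev(scores)  # sample standard deviation; assumes >= 2 differing scores
        # data[:n] plays the role of COLLECT(...)[..100] in the query.
        for title, score in data[:n]:
            result.append((year, title, abs(score - avg) / sd))
    return result
```

The slice is applied per group after the statistics are computed, which is why replacing the global LIMIT with a slice on the collected list changes the result from "top 100 overall" to "top 100 per year".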