Neo4j: how to return the top n from different tag groups?

Time: 2017-09-26 01:21:40

Tags: neo4j cypher top-n

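I have a database of "posts"; each post has a creation date, tags, and a title. Some posts are questions and the others are answers to those questions. The task is to find, for each distinct tag, the question with the shortest answer time. How can I achieve this?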

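I have already found all answered questions for each tag, together with the time difference between the question's creation time and the creation time of its first answer. However, I cannot return only the question with the shortest answer time for each tag (the top 1 in each tag group); I can only return everything.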

Can anyone help me with this?

Here is my query:

WITH ['geospatial', 'economics', 'usa', 'demographics'] AS topiclist
UNWIND topiclist AS topics
Match (p1:Posts)
UNWIND p1.Tags AS tags
WITH p1,trim(tags) AS tag
Where tag = topics
Match (p1)-[:PARENT_OF]->(p2:Posts)
WITH p1, p2.CreationDate - p1.CreationDate AS time,tag,p2
ORDER BY tag,time
Return p1.Title ,time,tag

Thanks to Michael Hunger for his answer, which solved my problem! If you have a similar problem, please check his answer.

Sample output:

Returned result; the 3 columns represent: question, time difference, tag

Sample data:

All the data used looks like this image: a post with a parentId is an answer, and its parent is the question it belongs to

1 answer:

Answer 0: (score: 0)

It would be better to trim the tags when the data is stored.

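WITH ['geospatial', 'economics', 'usa', 'demographics'] AS topiclist
UNWIND topiclist AS topic

// single(...) is true when exactly one trimmed tag of the post equals the topic
MATCH (p1:Posts) WHERE single(tag IN p1.Tags WHERE trim(tag) = topic)
MATCH (p1)-[:PARENT_OF]->(p2:Posts)
// tag is local to the single(...) predicate, so carry the topic forward under that name
WITH p1, p2.CreationDate - p1.CreationDate AS time, topic AS tag
ORDER BY tag, time
// rows arrive sorted by time within each tag, so the first collected entry is the fastest one
WITH tag, head(collect({post: p1, time: time})) AS first
RETURN first.post.Title AS title, first.time AS time, tag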
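
The title asks for the top n per tag, while the query above keeps only the first entry. A minimal sketch of one way to generalize it is to slice the collected list instead of taking its head. The limit of 3 is an arbitrary value chosen for illustration, `any(...)` is used here instead of `single(...)` so that a post with a repeated tag still matches, and `min(...)` is an assumption that "answer time" means the time to the earliest answer. Like the query above, it relies on `collect` seeing rows in the order set by `ORDER BY`.

```cypher
WITH ['geospatial', 'economics', 'usa', 'demographics'] AS topiclist
UNWIND topiclist AS topic
// any(...) is true when at least one trimmed tag equals the topic
MATCH (p1:Posts) WHERE any(tag IN p1.Tags WHERE trim(tag) = topic)
MATCH (p1)-[:PARENT_OF]->(p2:Posts)
// one row per (tag, question): keep the shortest gap to any of its answers (assumed meaning)
WITH topic AS tag, p1, min(p2.CreationDate - p1.CreationDate) AS time
ORDER BY tag, time
// keep at most the 3 fastest questions per tag; 3 is an illustrative value
WITH tag, collect({title: p1.Title, time: time})[0..3] AS fastest
UNWIND fastest AS row
RETURN tag, row.title AS title, row.time AS time
```

A slice that is longer than the list returns whatever elements exist, so tags with fewer than 3 answered questions should still appear in the result.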