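Please see the code below. It works fine. However, imagine there are up to hundreds of different case IDs (here there are only caseId 1 and 2). I cannot write a separate query for every case ID. Is there a way to simplify this? I have been searching for several days now.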
// Load the events from the CSV file
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:///running_to_csv.csv" AS row
WITH toInteger(row.case_id) AS cid, row
CREATE (event:Event {caseId: cid, activityName: row.activity, time: row.timestamp});

// Frequencies of consecutive activity pairs for caseId 1
MATCH (event:Event)
WHERE event.caseId = 1
WITH event ORDER BY event.time ASC
WITH apoc.coll.frequencies(apoc.coll.pairsMin(COLLECT(event.activityName))) AS g
UNWIND g AS p
RETURN *;

// The same query again for caseId 2
MATCH (event:Event)
WHERE event.caseId = 2
WITH event ORDER BY event.time ASC
WITH apoc.coll.frequencies(apoc.coll.pairsMin(COLLECT(event.activityName))) AS g
UNWIND g AS p
RETURN *;
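If I simply leave out the "event.caseId = ..." line, the result is wrong, because the events are then ordered by time across all cases rather than within each caseId. Thank you in advance.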
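Answer 0 (score: 0)
It seems like this should work: sort by time, collect the activity names per caseId, compute the frequencies per caseId, and then sort by caseId once more before the UNWIND: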
MATCH (event:Event)
WITH event.caseId AS caseId, event
ORDER BY event.time ASC
// caseId is the grouping key, so names are collected separately per case
WITH caseId, collect(event.activityName) AS names
WITH caseId, apoc.coll.frequencies(apoc.coll.pairsMin(names)) AS g
ORDER BY caseId ASC
UNWIND g AS p
RETURN *
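To see why the grouping by caseId matters, the same logic can be sketched outside Neo4j. This is only an illustration in plain Python, not part of the query: the sample rows and the helper names (`adjacent_pairs`, `pair_frequencies_by_case`, `pair_frequencies_global`) are made up for the example, and `adjacent_pairs` only approximates what `apoc.coll.pairsMin` does.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows using the CSV column names (case_id, activity, timestamp).
# ISO-style timestamp strings are used so that string order equals time order.
events = [
    {"case_id": 1, "activity": "A", "timestamp": "2020-01-01T10:00"},
    {"case_id": 2, "activity": "A", "timestamp": "2020-01-01T10:05"},
    {"case_id": 1, "activity": "B", "timestamp": "2020-01-01T10:10"},
    {"case_id": 2, "activity": "C", "timestamp": "2020-01-01T10:15"},
    {"case_id": 1, "activity": "C", "timestamp": "2020-01-01T10:20"},
]


def adjacent_pairs(names):
    """Return consecutive pairs of a list, similar in spirit to apoc.coll.pairsMin."""
    return list(zip(names, names[1:]))


def pair_frequencies_by_case(rows):
    """Sort by time, collect activity names per case, then count pairs per case."""
    ordered = sorted(rows, key=lambda r: r["timestamp"])
    names_by_case = defaultdict(list)
    for row in ordered:
        names_by_case[row["case_id"]].append(row["activity"])
    return {
        case_id: Counter(adjacent_pairs(names))
        for case_id, names in sorted(names_by_case.items())
    }


def pair_frequencies_global(rows):
    """What happens without the caseId grouping: pairs cross case boundaries."""
    ordered = sorted(rows, key=lambda r: r["timestamp"])
    return Counter(adjacent_pairs([r["activity"] for r in ordered]))


if __name__ == "__main__":
    print(pair_frequencies_by_case(events))
    print(pair_frequencies_global(events))
```

With these sample rows, the ungrouped version counts pairs such as ("A", "A") that never occur inside a single case, which is the wrong result described in the question. The grouped version keeps every pair within its own case.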