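As I'm new to Neo4j, I'm currently experimenting with the Neo4j movie database sample.
I'd like to know the best way to compare subgraphs and relationships, for example, how to get all movies that have the same cast.
Based on other questions on Stack Overflow, I got it to return all movies in which a specific set of actors appear together: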
WITH ['Tom Hanks', 'Meg Ryan'] as names
MATCH (p:Person)
WHERE p.name in names
WITH collect(p) as persons
WITH head(persons) as head, tail(persons) as persons
MATCH (head)-[:ACTED_IN]->(m:Movie)
WHERE ALL(p in persons WHERE (p)-[:ACTED_IN]->(m))
RETURN m.title
But how can I retrieve movies with the same actors without specifying the actors' names?
Answer 0 (score: 2)
This query should work:
// match the first movie and all its actors
match (m1:Movie)<-[:ACTED_IN]-(a1:Person)
// order actors by name
with m1, a1 order by a1.name
// store ordered actors into actors1 variable
with m1, collect(a1) as actors1
// match the second movie and all its actors
match (m2:Movie)<-[:ACTED_IN]-(a2:Person)
// avoid matching a movie with itself (and the same pair in reverse order) by requiring id(m1) > id(m2)
where id(m1) > id(m2)
// order actors of m2 by name
with m1, m2, actors1, a2 order by a2.name
// store ordered actors of m2 into actors2 variable
// pass to the next context only when the ordered lists (actors1 and actors2) are equal
with m1, m2, actors1, collect(a2) as actors2 where actors1 = actors2
// return movies that have the same actors
return m1, m2
Using the movie database (:play movie graph), this query produced the following output:
╒══════════════════════════════════════════════════════════════════════╤══════════════════════════════════════════════════════════════════════╕
│"m1" │"m2" │
╞══════════════════════════════════════════════════════════════════════╪══════════════════════════════════════════════════════════════════════╡
│{"title":"The Matrix Revolutions","tagline":"Everything that has a beg│{"title":"The Matrix Reloaded","tagline":"Free your mind","released":2│
│inning has an end","released":2003} │003} │
└──────────────────────────────────────────────────────────────────────┴──────────────────────────────────────────────────────────────────────┘
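One caveat with sorting on name: if two different Person nodes in a cast happen to share a name, their relative order in the two collected lists is not guaranteed, so equal casts could compare as unequal. Here is a minimal sketch that sorts on the internal node id instead. The aliases cast1 and cast2 are just illustrative names, and id() is deprecated in Neo4j 5 in favor of elementId(), so adjust for your version.

```cypher
// collect each movie's actor ids in a deterministic order
MATCH (m1:Movie)<-[:ACTED_IN]-(a1:Person)
WITH m1, a1 ORDER BY id(a1)
WITH m1, collect(id(a1)) AS cast1
// do the same for every candidate second movie
MATCH (m2:Movie)<-[:ACTED_IN]-(a2:Person)
WHERE id(m1) > id(m2)
WITH m1, m2, cast1, a2 ORDER BY id(a2)
WITH m1, m2, cast1, collect(id(a2)) AS cast2
// keep only pairs whose sorted id lists are identical
WHERE cast1 = cast2
RETURN m1.title, m2.title
```

Returning the titles rather than the whole nodes also keeps the output narrower than the table above.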
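Answer 1 (score: 2)
Some alternatives that may be more efficient (check with PROFILE):
Match movies to their actors only once, collect the results, and unwind that collection as many times as needed to generate a cross product, then filter and compare. This way you don't have to hit the database multiple times, since all the data you need comes from the first match. I'll borrow Bruno's query and tweak it slightly.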
// match every movie and all its actors (done only once)
match (m1:Movie)<-[:ACTED_IN]-(a1:Person)
// order actors by name
with m1, a1 order by a1.name
// store ordered actors into actors1 variable
with m1, collect(a1) as actors1
// collect this data into a single collection
with collect({m:m1, actors:actors1}) as data
// generate cross product of the data
unwind data as d1
unwind data as d2
with d1, d2
// prevent comparison against the same movie, or the same pairs in different orders
where id(d1.m) < id(d2.m) and d1.actors = d2.actors
// return movies that have the same actors
return d1.m, d2.m
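If the double UNWIND is unfamiliar, the same cross-product-then-filter pattern can be tried on a plain list first. This is only a toy sketch with made-up values, not part of the movie query:

```cypher
// a small list standing in for the collected {m, actors} maps
WITH [1, 2, 3] AS data
// unwinding the same list twice yields every (d1, d2) combination: 9 rows
UNWIND data AS d1
UNWIND data AS d2
WITH d1, d2
// keep each unordered pair once and drop self-pairs
WHERE d1 < d2
RETURN d1, d2
```

Only the three pairs (1, 2), (1, 3) and (2, 3) survive the filter, which is the same job `id(d1.m) < id(d2.m)` does in the query above.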
Alternatively, you can group movies by their actors and return only the groups that contain more than one movie:
// match every movie and all its actors
match (m1:Movie)<-[:ACTED_IN]-(a1:Person)
// order actors by name
with m1, a1 order by a1.name
// store ordered actors into actors1 variable
with m1, collect(a1) as actors1
// group movies with their sets of actors
with collect(m1) as movies, actors1
// only interested in where multiple movies have the same actor sets
where size(movies) > 1
// return the collection of movies with the same actors
return movies
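The second query is probably the better choice here, because you get all movies with the same cast together in one row, rather than a pair of movies per row.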
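To follow the PROFILE suggestion, prefix a query with the PROFILE keyword and compare the reported db hits across the variants. Below is a sketch using the grouping approach. It returns titles instead of whole nodes purely for readability, and the aliases actors and titles are my own.

```cypher
PROFILE
MATCH (m:Movie)<-[:ACTED_IN]-(a:Person)
WITH m, a ORDER BY a.name
// one ordered actor list per movie
WITH m, collect(a) AS actors
// group movie titles by identical actor list
WITH actors, collect(m.title) AS titles
WHERE size(titles) > 1
RETURN titles
```

On the sample movie graph this should group the two Matrix sequels from the first answer's output, but run it yourself to confirm and to see the execution plan.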