Suppose I have the following graph setup:
CREATE (john:Person {name: 'John Doe'}), (jane:Person {name: 'Jane Doe'}), (bob:Person {name: 'Bob Doe'})
CREATE (reading:Hobby {name: 'Reading'}), (sports:Hobby {name: 'Sports'}), (music:Hobby {name: 'Music'})
MERGE (john)-[:LIKES {intensity: 25}]->(reading)
MERGE (john)-[:LIKES {intensity: 70}]->(sports)
MERGE (john)-[:DISLIKES {intensity: 15}]->(music)
MERGE (jane)-[:LIKES {intensity: 50}]->(reading)
MERGE (jane)-[:DISLIKES {intensity: 40}]->(sports)
MERGE (jane)-[:LIKES {intensity: 20}]->(music)
MERGE (bob)-[:DISLIKES {intensity: 35}]->(reading)
MERGE (bob)-[:LIKES {intensity: 50}]->(sports)
MERGE (bob)-[:LIKES {intensity: 25}]->(music)
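Each person may like or dislike a given hobby with some arbitrary intensity.
To calculate each pair's mutual "passion" (both like or both dislike) for any given hobby, I can run the following: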
MATCH (a:Person)-[al]->(h:Hobby)<-[bl]-(b:Person)
WHERE ID(a) < ID(b) AND TYPE(al) = TYPE(bl)
RETURN a.name, b.name, TYPE(al), h.name, (al.intensity + bl.intensity) / 2 AS passion
To calculate each pair's "disdain" for a given hobby (one person likes it, the other dislikes it), I can run the inverse:
MATCH (a:Person)-[al]->(h:Hobby)<-[bl]-(b:Person)
WHERE ID(a) < ID(b) AND TYPE(al) <> TYPE(bl)
RETURN a.name, b.name, h.name, (al.intensity + bl.intensity) / 2 AS disdain
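Both calculations return exactly the information I expect, but I am having trouble computing the difference between "passion" and "disdain" in a single query, so that I can produce a final "compatibility" rating and sort the results in descending order.
I tried something like this: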
MATCH (a:Person)-[al]->(h:Hobby)<-[bl]-(b:Person)
WHERE ID(a) < ID(b) AND TYPE(al) <> TYPE(bl)
WITH (al.intensity + bl.intensity) / 2 AS disdain
MATCH (a:Person)-[al]->(h:Hobby)<-[bl]-(b:Person)
WHERE ID(a) < ID(b) AND TYPE(al) = TYPE(bl)
WITH a, b, h, disdain, (al.intensity + bl.intensity) / 2 AS passion
RETURN a.name, b.name, h.name, passion, disdain, (passion - disdain) AS compatibility
ORDER BY compatibility DESC
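But because of my inexperience with Neo4j and Cypher queries, I end up with wildly incorrect results.
I feel like I need some combination of COLLECT and UNWIND to get what I want, but I am not sure how to approach it, or whether I am even on the right track.
As a side note, I know I could get a simpler result by restricting the relationships to LIKES and using a signed integer for the intensity (i.e. a negative LIKE could represent a DISLIKE), but I would prefer to keep them separate if possible.
Any ideas?
Edit
Using the answer stdob gave me, I was able to add some aggregation, and I ended up with the following: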
MATCH (a:Person)-[al]->(h:Hobby)<-[bl]-(b:Person)
WHERE ID(a) < ID(b)
WITH a, al, h, bl, b, (al.intensity + bl.intensity)/2 AS value
WITH a, al, h, bl, b, value,
CASE WHEN TYPE(al) = TYPE(bl) THEN value ELSE 0 END AS mutual,
CASE WHEN TYPE(al) <> TYPE(bl) THEN value ELSE 0 END AS separate
RETURN DISTINCT a.name, SUM(mutual) AS passion, SUM(separate) AS disdain, (SUM(mutual) - SUM(separate)) AS compatibility, b.name
ORDER BY compatibility DESC
The output is much cleaner and exactly what I was hoping for:
NAME A      PASSION  DISDAIN  COMPATIBILITY  NAME B
"John Doe"  60       50       10             "Bob Doe"
"John Doe"  37       72       -35            "Jane Doe"
"Jane Doe"  22       87       -65            "Bob Doe"
Answer 0 (score: 2)
I think you need something like this:
MATCH (a:Person)-[al]->(h:Hobby)<-[bl]-(b:Person)
WHERE ID(a) < ID(b)
WITH a, al, h, bl, b, (al.intensity + bl.intensity)/2 AS value
WITH a, al, h, bl, b, value,
CASE WHEN TYPE(al) = TYPE(bl) THEN value ELSE 0 END AS passion,
CASE WHEN TYPE(al) <> TYPE(bl) THEN value ELSE 0 END AS disdain
RETURN a.name, b.name, h.name,
passion, disdain,
ABS(passion - disdain)/2.0 AS compatibility
ORDER BY compatibility DESC
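The attempt in the question goes wrong because its first WITH carries only disdain forward: a, b, and h are dropped, the second MATCH binds them afresh, and every disdain value ends up paired with every passion row. The query above avoids that by matching once and tagging each row. If one score per pair is wanted rather than one row per hobby, the same idea can be aggregated. The following is an untested sketch under my own assumptions, not part of the original answer: the `mutual` alias is mine, and dividing by 2.0 instead of 2 is a deliberate change so that Cypher does not truncate the averages to integers.

```cypher
// Match each pair once, then tag every shared hobby as mutual or not.
MATCH (a:Person)-[al]->(h:Hobby)<-[bl]-(b:Person)
WHERE ID(a) < ID(b)
WITH a, b,
     (al.intensity + bl.intensity) / 2.0 AS value,  // 2.0 forces float division
     TYPE(al) = TYPE(bl) AS mutual                  // true when both like or both dislike
// a and b are the grouping keys; SUM aggregates over the hobbies of each pair.
WITH a, b,
     SUM(CASE WHEN mutual THEN value ELSE 0.0 END) AS passion,
     SUM(CASE WHEN mutual THEN 0.0 ELSE value END) AS disdain
RETURN a.name, b.name, passion, disdain, passion - disdain AS compatibility
ORDER BY compatibility DESC
```

With float division the per-pair totals can differ by fractions from integer-based results, since odd sums such as 25 + 50 are no longer rounded down.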
Answer 1 (score: 1)
You can use UNION to combine the results of the two queries:
MATCH (a:Person)-[al]->(h:Hobby)<-[bl]-(b:Person)
WHERE ID(a) < ID(b) AND TYPE(al) = TYPE(bl)
RETURN a.name, b.name, "passion" AS intent, h.name, (al.intensity + bl.intensity) / 2 AS metric
UNION
MATCH (a:Person)-[al]->(h:Hobby)<-[bl]-(b:Person)
WHERE ID(a) < ID(b) AND TYPE(al) <> TYPE(bl)
RETURN a.name, b.name, "disdain" AS intent, h.name, (al.intensity + bl.intensity) / 2 AS metric
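A plain UNION returns the combined rows as they are, so it cannot sort or sum across both halves. On Neo4j 4.0 or later, a CALL { ... } subquery can wrap the union so that the outer query post-processes it. The sketch below is untested and rests on my own assumptions: the aliases `personA`, `personB`, and `score` are made up for illustration, disdain is emitted as a negative score, and UNION ALL is used because plain UNION would drop duplicate rows and distort the sums.

```cypher
CALL {
  // Mutual feelings count positively.
  MATCH (a:Person)-[al]->(h:Hobby)<-[bl]-(b:Person)
  WHERE ID(a) < ID(b) AND TYPE(al) = TYPE(bl)
  RETURN a.name AS personA, b.name AS personB,
         (al.intensity + bl.intensity) / 2.0 AS score
  UNION ALL
  // Opposing feelings count negatively.
  MATCH (a:Person)-[al]->(h:Hobby)<-[bl]-(b:Person)
  WHERE ID(a) < ID(b) AND TYPE(al) <> TYPE(bl)
  RETURN a.name AS personA, b.name AS personB,
         -(al.intensity + bl.intensity) / 2.0 AS score
}
// The outer query sees the combined rows and can aggregate and sort them.
RETURN personA, personB, SUM(score) AS compatibility
ORDER BY compatibility DESC
```

Both branches of a union must return the same column names, which is why each RETURN uses identical aliases.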
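Answer 2 (score: 1)
Here is my Cypher session with a solution to the problem you posed.
My approach assumes that the absence of a LIKES or DISLIKES relationship means an intensity of zero for that Hobby. I also make the DISLIKES intensity negative.
Note: it uses APOC functions, so you will need to install APOC.
See: https://github.com/neo4j-contrib/neo4j-apoc-procedures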
neo4j> // Step 1: Get a resultset of hobbies that we care about
MATCH (h:Hobby)
WITH h.name AS hobby
ORDER BY hobby
RETURN hobby;
+-----------+
| hobby |
+-----------+
| "Music" |
| "Reading" |
| "Sports" |
+-----------+
neo4j> // Step 2: Convert rows of hobbies into a collection of hobbies (row2col)
MATCH (h:Hobby)
WITH h.name AS hobby
ORDER BY hobby
WITH COLLECT(hobby) AS hobbies
RETURN hobbies;
+--------------------------------+
| hobbies |
+--------------------------------+
| ["Music", "Reading", "Sports"] |
+--------------------------------+
neo4j> // Step 3: With hobbies as "global" state, match with every :Person node
MATCH (h:Hobby)
WITH h.name AS hobby
ORDER BY hobby
WITH COLLECT(hobby) AS hobbies
MATCH (person:Person)
RETURN hobbies, person;
+---------------------------------------------------------------+
| hobbies | person |
+---------------------------------------------------------------+
| ["Music", "Reading", "Sports"] | (:Person {name: "John Doe"}) |
| ["Music", "Reading", "Sports"] | (:Person {name: "Jane Doe"}) |
| ["Music", "Reading", "Sports"] | (:Person {name: "Bob Doe"}) |
+---------------------------------------------------------------+
neo4j> // Step 4: Gather likes and dislikes into maps
MATCH (h:Hobby)
WITH h.name AS hobby
ORDER BY hobby
WITH COLLECT(hobby) AS hobbies
MATCH (person:Person)
OPTIONAL
MATCH (person)-[LIKES:LIKES]->(h:Hobby)
WITH hobbies, person, apoc.map.fromLists(COLLECT(h.name), COLLECT(LIKES.intensity)) AS likes
OPTIONAL
MATCH (person)-[DISLIKES:DISLIKES]->(h:Hobby)
RETURN hobbies, person, likes,
apoc.map.fromLists(COLLECT(h.name), COLLECT(DISLIKES.intensity)) AS dislikes;
+-----------------------------------------------------------------------------------------------------------+
| hobbies | person | likes | dislikes |
+-----------------------------------------------------------------------------------------------------------+
| ["Music", "Reading", "Sports"] | (:Person {name: "Jane Doe"}) | {Music: 20, Reading: 50} | {Sports: 40} |
| ["Music", "Reading", "Sports"] | (:Person {name: "John Doe"}) | {Reading: 25, Sports: 70} | {Music: 15} |
| ["Music", "Reading", "Sports"] | (:Person {name: "Bob Doe"}) | {Music: 25, Sports: 50} | {Reading: 35} |
+-----------------------------------------------------------------------------------------------------------+
neo4j> // Step 5: Turn maps into collections (vectors), using hobbies list
MATCH (h:Hobby)
WITH h.name AS hobby
ORDER BY hobby
WITH COLLECT(hobby) AS hobbies
MATCH (person:Person)
OPTIONAL
MATCH (person)-[LIKES:LIKES]->(h:Hobby)
WITH hobbies, person, apoc.map.fromLists(COLLECT(h.name), COLLECT(LIKES.intensity)) AS likes
OPTIONAL
MATCH (person)-[DISLIKES:DISLIKES]->(h:Hobby)
WITH hobbies, person, likes,
apoc.map.fromLists(COLLECT(h.name), COLLECT(DISLIKES.intensity)) AS dislikes
RETURN person,
[x IN hobbies | COALESCE(likes[x], 0)] AS likes,
[x IN hobbies | COALESCE(-dislikes[x], 0)] AS dislikes;
+----------------------------------------------------------+
| person | likes | dislikes |
+----------------------------------------------------------+
| (:Person {name: "Jane Doe"}) | [20, 50, 0] | [0, 0, -40] |
| (:Person {name: "John Doe"}) | [0, 25, 70] | [-15, 0, 0] |
| (:Person {name: "Bob Doe"}) | [25, 0, 50] | [0, -35, 0] |
+----------------------------------------------------------+
neo4j> // Step 6: Map each person against each other
MATCH (h:Hobby)
WITH h.name AS hobby
ORDER BY hobby
WITH COLLECT(hobby) AS hobbies
MATCH (person:Person)
OPTIONAL
MATCH (person)-[LIKES:LIKES]->(h:Hobby)
WITH hobbies, person, apoc.map.fromLists(COLLECT(h.name), COLLECT(LIKES.intensity)) AS likes
OPTIONAL
MATCH (person)-[DISLIKES:DISLIKES]->(h:Hobby)
WITH hobbies, person, likes,
apoc.map.fromLists(COLLECT(h.name), COLLECT(DISLIKES.intensity)) AS dislikes
WITH person,
[x IN hobbies | COALESCE(likes[x], 0)] AS likes,
[x IN hobbies | COALESCE(-dislikes[x], 0)] AS dislikes
WITH COLLECT({person:person, likes:likes, dislikes:dislikes}) AS rows
UNWIND rows AS left
UNWIND rows AS right
WITH left, right
WHERE ID(left.person) < ID(right.person)
RETURN left, right;
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| left | right |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| {person: (:Person {name: "Jane Doe"}), dislikes: [0, 0, -40], likes: [20, 50, 0]} | {person: (:Person {name: "Bob Doe"}), dislikes: [0, -35, 0], likes: [25, 0, 50]} |
| {person: (:Person {name: "John Doe"}), dislikes: [-15, 0, 0], likes: [0, 25, 70]} | {person: (:Person {name: "Jane Doe"}), dislikes: [0, 0, -40], likes: [20, 50, 0]} |
| {person: (:Person {name: "John Doe"}), dislikes: [-15, 0, 0], likes: [0, 25, 70]} | {person: (:Person {name: "Bob Doe"}), dislikes: [0, -35, 0], likes: [25, 0, 50]} |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
neo4j> // Step 7: Calculate simple averages
MATCH (h:Hobby)
WITH h.name AS hobby
ORDER BY hobby
WITH COLLECT(hobby) AS hobbies
MATCH (person:Person)
OPTIONAL
MATCH (person)-[LIKES:LIKES]->(h:Hobby)
WITH hobbies, person, apoc.map.fromLists(COLLECT(h.name), COLLECT(LIKES.intensity)) AS likes
OPTIONAL
MATCH (person)-[DISLIKES:DISLIKES]->(h:Hobby)
WITH hobbies, person, likes,
apoc.map.fromLists(COLLECT(h.name), COLLECT(DISLIKES.intensity)) AS dislikes
WITH person,
[x IN hobbies | COALESCE(likes[x], 0)] AS likes,
[x IN hobbies | COALESCE(-dislikes[x], 0)] AS dislikes
WITH COLLECT({person: person, likes:likes, dislikes:dislikes}) AS coll
UNWIND coll AS left
UNWIND coll AS right
WITH left, right
WHERE ID(left.person) < ID(right.person)
RETURN left.person.name,
right.person.name,
left.likes,
right.likes,
EXTRACT(x IN apoc.coll.zip(left.likes, right.likes) | (x[0] + x[1]) / 2) AS avg_like,
left.dislikes,
right.dislikes,
EXTRACT(x IN apoc.coll.zip(left.dislikes, right.dislikes) | (x[0] + x[1]) / 2) AS avg_dislike;
+----------------------------------------------------------------------------------------------------------------------------------+
| left.person.name | right.person.name | left.likes | right.likes | avg_like | left.dislikes | right.dislikes | avg_dislike |
+----------------------------------------------------------------------------------------------------------------------------------+
| "Jane Doe" | "Bob Doe" | [20, 50, 0] | [25, 0, 50] | [22, 25, 25] | [0, 0, -40] | [0, -35, 0] | [0, -17, -20] |
| "John Doe" | "Jane Doe" | [0, 25, 70] | [20, 50, 0] | [10, 37, 35] | [-15, 0, 0] | [0, 0, -40] | [-7, 0, -20] |
| "John Doe" | "Bob Doe" | [0, 25, 70] | [25, 0, 50] | [12, 12, 60] | [-15, 0, 0] | [0, -35, 0] | [-7, -17, 0] |
+----------------------------------------------------------------------------------------------------------------------------------+
neo4j> // Step 8: Try apoc.algo.euclideanSimilarity()
MATCH (h:Hobby)
WITH h.name AS hobby
ORDER BY hobby
WITH COLLECT(hobby) AS hobbies
MATCH (person:Person)
OPTIONAL
MATCH (person)-[LIKES:LIKES]->(h:Hobby)
WITH hobbies, person, apoc.map.fromLists(COLLECT(h.name), COLLECT(LIKES.intensity)) AS likes
OPTIONAL
MATCH (person)-[DISLIKES:DISLIKES]->(h:Hobby)
WITH hobbies, person, likes,
apoc.map.fromLists(COLLECT(h.name), COLLECT(DISLIKES.intensity)) AS dislikes
WITH person,
[x IN hobbies | COALESCE(likes[x], 0)] AS likes,
[x IN hobbies | COALESCE(-dislikes[x], 0)] AS dislikes
WITH COLLECT({person: person, likes:likes, dislikes:dislikes}) AS coll
UNWIND coll AS left
UNWIND coll AS right
WITH left, right
WHERE ID(left.person) < ID(right.person)
RETURN left.person.name,
right.person.name,
EXTRACT(x IN apoc.coll.zip(left.likes, right.likes) | (x[0] + x[1]) / 2) AS avg_like,
EXTRACT(x IN apoc.coll.zip(left.dislikes, right.dislikes) | (x[0] + x[1]) / 2) AS avg_dislike,
apoc.algo.euclideanSimilarity(left.likes, right.likes) AS euclidean_like,
apoc.algo.euclideanSimilarity(left.dislikes, right.dislikes) AS euclidean_dislike;
+-------------------------------------------------------------------------------------------------------------------+
| left.person.name | right.person.name | avg_like | avg_dislike | euclidean_like | euclidean_dislike |
+-------------------------------------------------------------------------------------------------------------------+
| "John Doe" | "Jane Doe" | [10, 37, 35] | [-7, 0, -20] | 0.012824784198464426 | 0.02287281728431341 |
| "John Doe" | "Bob Doe" | [12, 12, 60] | [-7, -17, 0] | 0.024026799286343117 | 0.025589279178274353 |
| "Jane Doe" | "Bob Doe" | [22, 25, 25] | [0, -17, -20] | 0.013910675635706434 | 0.018466972048042936 |
+-------------------------------------------------------------------------------------------------------------------+
neo4j> // Step 9: Save our similarity calculations (yay, new relationships!)
MATCH (h:Hobby)
WITH h.name AS hobby
ORDER BY hobby
WITH COLLECT(hobby) AS hobbies
MATCH (person:Person)
OPTIONAL
MATCH (person)-[LIKES:LIKES]->(h:Hobby)
WITH hobbies, person, apoc.map.fromLists(COLLECT(h.name), COLLECT(LIKES.intensity)) AS likes
OPTIONAL
MATCH (person)-[DISLIKES:DISLIKES]->(h:Hobby)
WITH hobbies, person, likes,
apoc.map.fromLists(COLLECT(h.name), COLLECT(DISLIKES.intensity)) AS dislikes
WITH person,
[x IN hobbies | COALESCE(likes[x], 0)] AS likes,
[x IN hobbies | COALESCE(-dislikes[x], 0)] AS dislikes
WITH COLLECT({person: person, likes:likes, dislikes:dislikes}) AS coll
UNWIND coll AS left
UNWIND coll AS right
WITH left, right
WHERE ID(left.person) < ID(right.person)
WITH left.person AS person,
right.person AS other,
EXTRACT(x IN apoc.coll.zip(left.likes, right.likes) | (x[0] + x[1]) / 2) AS avg_like,
EXTRACT(x IN apoc.coll.zip(left.dislikes, right.dislikes) | (x[0] + x[1]) / 2) AS avg_dislike,
apoc.algo.euclideanSimilarity(left.likes, right.likes) AS euclidean_like,
apoc.algo.euclideanSimilarity(left.dislikes, right.dislikes) AS euclidean_dislike
MERGE (person)-[LIKE:LIKE_SIMILARITY]->(other)
SET LIKE.euclidean = euclidean_like,
LIKE.avg = avg_like
MERGE (person)-[DISLIKE:DISLIKE_SIMILARITY]->(other)
SET DISLIKE.euclidean = euclidean_dislike,
DISLIKE.avg = avg_dislike
RETURN person.name, other.name, LIKE, DISLIKE;
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| person.name | other.name | LIKE | DISLIKE |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "Jane Doe" | "Bob Doe" | [:LIKE_SIMILARITY {euclidean: 0.013910675635706434, avg: [22, 25, 25]}] | [:DISLIKE_SIMILARITY {euclidean: 0.018466972048042936, avg: [0, -17, -20]}] |
| "John Doe" | "Jane Doe" | [:LIKE_SIMILARITY {euclidean: 0.012824784198464426, avg: [10, 37, 35]}] | [:DISLIKE_SIMILARITY {euclidean: 0.02287281728431341, avg: [-7, 0, -20]}] |
| "John Doe" | "Bob Doe" | [:LIKE_SIMILARITY {euclidean: 0.024026799286343117, avg: [12, 12, 60]}] | [:DISLIKE_SIMILARITY {euclidean: 0.025589279178274353, avg: [-7, -17, 0]}] |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
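Note: I am not sure whether this is a good similarity measure for your use case, but it at least demonstrates some of the data transformations that are possible with Cypher + APOC.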
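One caveat for newer installations: as far as I know, EXTRACT was deprecated in Neo4j 3.5 and removed in 4.0, so the averaging in steps 7 to 9 may need a list comprehension instead. A minimal sketch, using Jane's and Bob's like vectors from step 5 as literal input; the names `xs` and `ys` are placeholders of mine, and indexing by position also removes the need for apoc.coll.zip:

```cypher
// Element-wise integer average of two equal-length lists without EXTRACT or APOC.
WITH [20, 50, 0] AS xs, [25, 0, 50] AS ys
RETURN [i IN range(0, size(xs) - 1) | (xs[i] + ys[i]) / 2] AS avg_like
// avg_like = [22, 25, 25], matching the Jane Doe / Bob Doe row in step 7
```

In the full query, the same comprehension would take left.likes and right.likes (or the dislikes lists) in place of the literals.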