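I'm stumped, but I'm sure I'm missing something obvious. In short, I can't figure out why my query returns twice the values I expect.
The screenshot below shows the query, its results beneath it, and some accurate baseline data (the bottom result set).
For example, Team B really did play 4 games and score 10 points, but the second query at the top returns double those numbers.
To follow up, here is my toy database: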
// add constraint
CREATE CONSTRAINT ON (n:Team) ASSERT n.id IS UNIQUE;
// load the teams
LOAD CSV WITH HEADERS FROM "https://docs.google.com/spreadsheets/d/1QwXJE2qWVsejWeJGOouYxblNTox_Z9Ly5TWggzQNQVY/pub?gid=0&single=true&output=csv" AS row
WITH row
MERGE (t:Team {id:row.id, name:row.name});
// load the games
LOAD CSV WITH HEADERS FROM "https://docs.google.com/spreadsheets/d/1QwXJE2qWVsejWeJGOouYxblNTox_Z9Ly5TWggzQNQVY/pub?gid=33501648&single=true&output=csv" AS row
WITH row
CREATE (g:Game)
MERGE (h:Team {id:row.hometeam})
MERGE (a:Team {id:row.awayteam})
MERGE (a)-[:AWAY_TEAM {score:row.awayscore}]->(g)
MERGE (h)-[:HOME_TEAM {score:row.homescore}]->(g);
// Games played
MATCH (t:Team)-[r]->(x:Game)
RETURN t.name, count(x) as games, sum(r.score) as for
ORDER BY games DESC
// the query in question which 2x's the results
MATCH (t1:Team)-[r1]->(g1:Game)
MATCH (g1)<-[r2]-(t2:Team)
RETURN t1.name, count(r1) as games, sum(r1.score) as for, sum(r2.score) as against
ORDER BY games DESC
Here is a view of the whole graph, where the number on each edge is the team's score in that particular game.
Answer 0 (score: 3)
The problem with the query is that each team is matched twice for every game it played. In these two MATCH clauses:
MATCH (t1:Team)-[r1]->(g1:Game)
MATCH (g1)<-[r2]-(t2:Team)
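every team ends up bound to both t1 and t2. Nothing forces r2 to be a different relationship from r1 when they appear in separate MATCH clauses, so for each game the second MATCH returns t1 itself as well as its opponent. That yields two rows per team per game, which doubles count(r1) and sum(r1.score).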
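A quick way to confirm this on the toy data is to count, per team, how many matched rows pair the team with itself. This is only a diagnostic sketch; the aliases `team`, `matched_rows` and `self_rows` are made up for illustration.

```cypher
// Diagnostic sketch: how many rows does each team produce, and how many
// of those rows have t2 bound to the same node as t1?
MATCH (t1:Team)-[r1]->(g1:Game)
MATCH (g1)<-[r2]-(t2:Team)
RETURN t1.name AS team,
       count(*) AS matched_rows,
       sum(CASE WHEN t1 = t2 THEN 1 ELSE 0 END) AS self_rows
ORDER BY team
```

If the diagnosis holds, and assuming every game has exactly two teams attached, self_rows should come out as half of matched_rows for each team: one self-pairing and one opponent pairing per game.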
To fix this, just add WHERE NOT t1=t2:
MATCH (t1:Team)-[r1]->(g1:Game)
MATCH (g1)<-[r2]-(t2:Team) WHERE NOT t1=t2
RETURN t1.name, count(r1) as games, sum(r1.score) as for, sum(r2.score) as against
ORDER BY games DESC
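An alternative that should give the same result is to put both hops into a single MATCH pattern. Cypher does not traverse the same relationship twice within one pattern, so r2 cannot be r1 and t2 can only be the opponent. This is a sketch, assuming each team has exactly one relationship to each game it played; the aliases `points_for` and `points_against` are my own choice, used here because `for` is listed as a reserved word in some Cypher versions.

```cypher
// Single-pattern variant: r1 and r2 must be distinct relationships,
// so t2 is always the other team in the game.
MATCH (t1:Team)-[r1]->(g1:Game)<-[r2]-(t2:Team)
RETURN t1.name,
       count(r1) AS games,
       sum(r1.score) AS points_for,
       sum(r2.score) AS points_against
ORDER BY games DESC
```

The WHERE NOT t1=t2 version states the intent more explicitly, while this one relies on the pattern-matching rule, so which to use is mostly a matter of taste.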
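Also, LOAD CSV reads every field as a string, so in your import statement you should use the toInt function to make sure integer values (such as the scores) are converted properly: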
LOAD CSV WITH HEADERS FROM "https://docs.google.com/spreadsheets/d/1QwXJE2qWVsejWeJGOouYxblNTox_Z9Ly5TWggzQNQVY/pub?gid=33501648&single=true&output=csv" AS row
WITH row
CREATE (g:Game)
MERGE (h:Team {id:row.hometeam})
MERGE (a:Team {id:row.awayteam})
MERGE (a)-[:AWAY_TEAM {score:toInt(row.awayscore)}]->(g)
MERGE (h)-[:HOME_TEAM {score:toInt(row.homescore)}]->(g);
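If the scores have already been imported as strings, they could probably be converted in place instead of reloading the CSV. The sketch below assumes every relationship from a Team to a Game carries a `score` property. It uses toInteger, which newer Neo4j releases provide as the replacement for toInt; on an older version, substitute toInt.

```cypher
// Convert scores that were loaded as strings into integers.
// Assumes every Team-to-Game relationship has a `score` property.
MATCH (:Team)-[r]->(:Game)
SET r.score = toInteger(r.score)
```

Running it a second time should be harmless, since converting a value that is already an integer returns it unchanged.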