Neo4j

Time: 2017-07-05 10:34:43

Tags: neo4j cypher graph-databases

I'm running into a lot of problems trying to do something that should be very easy. I have these 3 CSVs. The first is the Cast CSV, ~150k records:

idActor,idMovie,nameActor,character
"10990","321612","Belle","Emma Watson"
"221018","321612","Beast","Dan Stevens"
"114019","321612","Gaston","Luke Evans"
"8945","321612","Maurice","Kevin Kline"

The second is the Actors CSV, ~8k records:

idActor,deathday,gender,birthday,name,place_of_birth,popularity
"10990","null","1","1990-04-15","Emma Watson","Paris; France","54.327581"
"114019","null","2","1979-04-15","Luke Evans","Pontypool; Wales; UK","13.145154"
"54415","null","2","1981-02-23","Josh Gad","Hollywood; Florida; USA","11.418704"
"2283","null","2","1960-11-11","Stanley Tucci","Peekskill; New York; USA","11.013948"
"221018","null","2","1982-10-10","Dan Stevens","Croydon; Surrey; England; UK","27.461241"
"3061","null","2","1971-03-31","Ewan McGregor","Perth; Scotland; UK","18.398385"

The last is the Movies CSV, ~10k records:

(header, 1 row)
idMovie,overview,popularity,companies,countries,release_date,revenue,runtime,tagline,vote_average,vote_count,budget,genres,title

"99861","When Tony Stark tries to jumpstart a dormant peacekeeping program, things go awry and Earth’s Mightiest Heroes are put to the ultimate test as the fate of the planet hangs in the balance. As the villainous Ultron emerges, it is up to The Avengers to stop him from enacting his terrible plans, and soon uneasy alliances and unexpected action pave the way for an epic and unique global adventure.","10.836173","Marvel Studios;Prime Focus;Revolution Sun Studios","US","2015-04-22","1405035767","141","A New Age Has Come.","7.3","5868","280000000","Action;Adventure;Science Fiction","Avengers: Age of Ultron"

After importing them into Neo4j and converting the necessary fields to Integer or Float, I want to create some relationships.

For example:

MATCH (m:Movie), (c:CastMember), (a:Actor)
WHERE m.idMovie = c.idMovie AND a.idActor = c.idActor
CREATE (a)-[:HAS_ACTED_IN{character:c.character}]->(m)

But it creates only about 50 relationships instead of more than 100k.

So I tried a few other queries, in particular:

MATCH (m:Movie), (c:CastMember)
WHERE m.idMovie = c.idMovie
RETURN DISTINCT m.title

This returns only about 20 matches.

Again,

MATCH (c:CastMember), (a:Actor)
WHERE c.idActor = a.idActor
RETURN a.name

This returns only ~5 matches.

The matches change every time I import the CSVs, which is very strange.

2 Answers:

Answer 0: (score: 0)

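The documentation on WHERE says:

  WHERE adds constraints to the patterns in a MATCH or OPTIONAL MATCH clause or filters the results of a WITH clause.

So a single Cypher query can use more than one WHERE, each attached to its own MATCH. I believe that applies to your case. I would try something like: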

MATCH (c:CastMember)
MATCH (a:Actor)
WHERE a.idActor = c.idActor
MATCH (m:Movie)
WHERE m.idMovie = c.idMovie 
CREATE (a)-[:HAS_ACTED_IN{character:c.character}]->(m)

Answer 1: (score: 0)

So, you should first create these constraints:

CREATE CONSTRAINT ON (a:Actor) ASSERT a.idActor IS UNIQUE;
CREATE CONSTRAINT ON (m:Movie) ASSERT m.idMovie IS UNIQUE;

Then load the actors and movies with LOAD CSV. After that, count the nodes. Do those numbers check out?
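As a rough sketch of that load-then-count step: the file names `actors.csv` and `movies.csv` below are assumptions (the question does not give them), and only a couple of properties are set for brevity. The ids are kept as strings here, because LOAD CSV reads every field as a string and the cast import further down looks them up by `line.idMovie` and `line.idActor` without conversion.

```cypher
// Hypothetical file names; adjust them to wherever the CSVs actually live.
LOAD CSV WITH HEADERS FROM 'file:///actors.csv' AS line
MERGE (a:Actor {idActor: line.idActor})   // id stays a string, as read from the CSV
SET a.name = line.name;

LOAD CSV WITH HEADERS FROM 'file:///movies.csv' AS line
MERGE (m:Movie {idMovie: line.idMovie})   // id stays a string, as read from the CSV
SET m.title = line.title;

// Sanity check: these should be close to the CSV row counts (~8k and ~10k).
MATCH (a:Actor) RETURN count(a) AS actors;
MATCH (m:Movie) RETURN count(m) AS movies;
```

If the counts are far below the row counts, the import itself is the likely problem, not the relationship query.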

Then you can create the relationships directly in the LOAD CSV for the cast:

LOAD CSV ... AS line
MATCH (m:Movie {idMovie : line.idMovie})
MATCH (a:Actor {idActor : line.idActor})
MERGE (a)-[:HAS_ACTED_IN {character: line.character}]->(m);
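One thing that may matter here: the question mentions converting some fields to Integer after import. If `idMovie` and `idActor` ended up stored as integers on the nodes, a lookup with the raw string from the CSV would not match, since Cypher does not treat `321612` and `"321612"` as equal. A minimal variant under that assumption, with `cast.csv` as a hypothetical file name:

```cypher
// Assumes the ids are stored as integers on :Movie and :Actor nodes.
// 'file:///cast.csv' is a placeholder, not the asker's actual path.
LOAD CSV WITH HEADERS FROM 'file:///cast.csv' AS line
MATCH (m:Movie {idMovie: toInteger(line.idMovie)})
MATCH (a:Actor {idActor: toInteger(line.idActor)})
MERGE (a)-[:HAS_ACTED_IN {character: line.character}]->(m);

// Check how many relationships were actually created.
MATCH (:Actor)-[r:HAS_ACTED_IN]->(:Movie)
RETURN count(r) AS relationships;
```

Whichever type is used, it has to be the same on the stored property and on the value taken from the CSV row.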

I also noticed the following:

  1. The header order in the cast CSV is incorrect.
  2. WHERE m.id = c.idMovie AND a.idActor = c.idActor won't work; it should be m.idMovie.
  3. But these may just be typos.
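To illustrate the first point: in the cast sample rows from the question, the value under `nameActor` is the character ("Belle") and the value under `character` is the actor ("Emma Watson"). If the header cannot be corrected in the file, one possible workaround, sketched here with a hypothetical `cast.csv` file name, is to read the character from the `nameActor` column:

```cypher
// 'file:///cast.csv' is a placeholder file name.
LOAD CSV WITH HEADERS FROM 'file:///cast.csv' AS line
MATCH (m:Movie {idMovie: line.idMovie})
MATCH (a:Actor {idActor: line.idActor})
// With the header as posted, the nameActor column holds the character name.
MERGE (a)-[:HAS_ACTED_IN {character: line.nameActor}]->(m);
```

Fixing the header row in the CSV itself is the cleaner option if the file is under your control.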

Hope this helps, Tom