Adding multiple relationships using WITH, WHERE and UNWIND

Posted: 2019-02-21 08:03:25

Tags: neo4j cypher

I have data with the following structure:

{"id": "1", "name": "A. I. Lazarev", "org": "United States Department of State", "tags": [{"t": "Infrared"}, {"t": "Near-infrared spectroscopy"}, {"t": "Infrared astronomy"}, {"t": "Data collection"}], "pubs": [{"i": "1542417502", "r": 6}], }
{"id": "2", "name": "Stevan Spremo", "tags": [{"t": "Micro-g environment"}, {"t": "Antibiotics"}, {"t": "Bacteriology"}], "pubs": [{"i": "222163962", "r": 0}], }
{"id": "3", "name": "Bricchi G", "pubs": [{"i": "2417067698", "r": 1}, {"i": "2406980973", "r": 1}]}

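Some rows have tags, some have an org, some have both, and some have neither.

I want to add relationships between (1) authors and tags, (2) authors and organizations, and (3) authors and publications. I already have the publications as nodes, so once (1) and (2) work, (3) should be easy.

I have been trying the following code: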

CALL apoc.periodic.iterate(
"CALL apoc.load.json('file:/test.txt') YIELD value AS q RETURN q",
"UNWIND q.id as id
CREATE (a:Author {id:id, name:q.name, citations:q.n_citation, publications:q.n_pubs})
WITH q, a
UNWIND q.tags as tags
MERGE (t:Tag {{name: tags.t}})
CREATE (a)-[:HAS_TAGS]->(t)
WITH q, a
WHERE q.org is not null
MERGE (o:Organization {name: q.org})
CREATE (a)-[:AFFILIATED_WITH]->(o)",
{batchSize:10000, iterateList:true, parallel:false})

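Tags and organizations appear multiple times in the data, but there should be only one node per tag and per organization, so I used MERGE to create unique nodes for them.

The problem with the code above is that it creates duplicate AFFILIATED_WITH relationships: it actually creates as many AFFILIATED_WITH relationships per author as that author has tags.

How can I change the Cypher query so that duplicate relationships are not created?

1 Answer:

Answer 0 (score: 3)

After this clause: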

UNWIND q.tags as tags

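your query will have as many data rows as the current q has tags (each row carrying the q, a, id and tags values). Every subsequent operation is executed once per data row. That is why you are creating too many AFFILIATED_WITH relationships.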
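The row multiplication can be observed in isolation. The following is a minimal sketch that uses an inline map as a stand-in for one loaded record; the values 'A', 'Org', 'x', 'y' and 'z' are invented for illustration and do not come from the question's data.

```cypher
// Hypothetical stand-in for one record: an author with three tags
WITH {name: 'A', org: 'Org', tags: [{t: 'x'}, {t: 'y'}, {t: 'z'}]} AS q
UNWIND q.tags AS tags
// Three rows exist here, so any clause placed after the UNWIND runs three times
RETURN q.name AS name, count(*) AS rowsAfterUnwind
```

This should return a single result with rowsAfterUnwind equal to 3, which is the number of times a CREATE placed at that point would run for this author.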

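To fix this, you have to reduce the number of data rows at the right point (which also speeds up processing, since needless repeated operations are avoided). In your case, you only need to change the second WITH q, a clause to WITH DISTINCT q, a: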

CALL apoc.periodic.iterate(
  "CALL apoc.load.json('file:///test.txt') YIELD value AS q RETURN q",
  "CREATE (a:Author {id:q.id, name:q.name, citations:q.n_citation, publications:q.n_pubs})
   WITH q, a
   UNWIND q.tags as tags
   MERGE (t:Tag {name: tags.t})
   CREATE (a)-[:HAS_TAGS]->(t)
   WITH DISTINCT q, a
   WHERE q.org is not null
   MERGE (o:Organization {name: q.org})
   CREATE (a)-[:AFFILIATED_WITH]->(o)",
  {batchSize:10000, iterateList:true, parallel:false}
)

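I also simplified the query by removing the unnecessary UNWIND q.id as id clause, and fixed some syntax issues (such as the doubled braces in {{name: tags.t}}).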
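To see why WITH DISTINCT helps, the effect can be checked on its own. This is a small sketch under the same assumption as before: the inline map is a hypothetical stand-in for one record, with invented values.

```cypher
// Hypothetical stand-in for one record: an author with three tags
WITH {name: 'A', org: 'Org', tags: [{t: 'x'}, {t: 'y'}, {t: 'z'}]} AS q
UNWIND q.tags AS tags
// Dropping tags from scope and de-duplicating leaves one row per q
WITH DISTINCT q
RETURN count(*) AS rowsAfterDistinct
```

This should return rowsAfterDistinct equal to 1. A plain WITH q at the same position would still pass three identical rows on, because WITH only changes which variables are in scope, not how many rows there are.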

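[UPDATED]

If you want to add the AUTHORED relationships (as requested in the comments on this answer), you should do that before creating the AFFILIATED_WITH relationships, because the WHERE q.org is not null clause will filter out some of the q rows. Also, whenever you use CREATE to create a relationship, Cypher requires you to specify a direction for that relationship: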

CALL apoc.periodic.iterate(
  "CALL apoc.load.json('file:///test.txt') YIELD value AS q RETURN q",
  "CREATE (a:Author {id:q.id, name:q.name, citations:q.n_citation, publications:q.n_pubs})
   WITH q, a
   UNWIND q.tags as tags
   MERGE (t:Tag {name: tags.t})
   CREATE (a)-[:HAS_TAGS]->(t)
   WITH DISTINCT q, a
   UNWIND q.pubs as pubs
   MERGE (p:Quanta {id: pubs.i})
   CREATE (a)-[r:AUTHORED {rank: pubs.r}]->(p)
   WITH DISTINCT q, a
   WHERE q.org is not null
   MERGE (o:Organization {name: q.org})
   CREATE (a)-[:AFFILIATED_WITH]->(o)",
  {batchSize:10000, iterateList:true, parallel:false}
)
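One caveat, given that some rows in the question have no tags: UNWIND over a null or empty list produces zero rows, so for such a record every clause after UNWIND q.tags is skipped, including the organization and publication steps. A possible alternative, offered only as a sketch and not tested against the real data, is to use FOREACH, which performs its updates for each list element without changing the number of rows. The labels and property names are the ones used in the queries above; coalesce(..., []) is there to guard against missing lists.

```cypher
// Sketch: FOREACH keeps one row per record, so no DISTINCT is needed
// and records without tags or pubs still reach the organization step.
CALL apoc.periodic.iterate(
  "CALL apoc.load.json('file:///test.txt') YIELD value AS q RETURN q",
  "CREATE (a:Author {id:q.id, name:q.name, citations:q.n_citation, publications:q.n_pubs})
   FOREACH (tag IN coalesce(q.tags, []) |
     MERGE (t:Tag {name: tag.t})
     CREATE (a)-[:HAS_TAGS]->(t))
   FOREACH (pub IN coalesce(q.pubs, []) |
     MERGE (p:Quanta {id: pub.i})
     CREATE (a)-[:AUTHORED {rank: pub.r}]->(p))
   WITH q, a
   WHERE q.org IS NOT NULL
   MERGE (o:Organization {name: q.org})
   CREATE (a)-[:AFFILIATED_WITH]->(o)",
  {batchSize:10000, iterateList:true, parallel:false}
)
```

Because the row count stays at one per record throughout, each author ends up with at most one AFFILIATED_WITH relationship, and the final WHERE only affects the organization step.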