Partially importing a large CSV file into Neo4j

Date: 2018-09-12 22:40:07

Tags: csv neo4j cypher

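I have a very large CSV file (about 7 GB). It holds the results of pairwise computations between the rows for object 1 and object 2, using three different methods, and looks like this: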

obj1,obj2,method1,method2,method3

obj1 and obj2 are strings; method1, method2, and method3 are floating-point values.

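I don't want to import the whole CSV into Neo4j. Instead, when the value of method1 is above some threshold, I want to import that row and create an edge between the row's obj1 and obj2. The same goes for method2 and method3: when the value of method2 is above its threshold, for example, I want to import the row and create an edge between its obj1 and obj2. Thanks to @cybersam and @Dave Bennett, who wrote the original query for me here. I made some changes and ran the three queries shown at the end of this question separately. Afterwards, when I write a simple query such as: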

   match (n)-[r:similar_on_method2]-(m) return n,r,m 

I get not only the relationships I asked for; the resulting graph also contains other relationships, and I don't understand why.

 USING PERIODIC COMMIT
 LOAD CSV WITH HEADERS
 FROM 'file:///objects.csv'
 AS line
 WITH line
 WHERE toFloat(line.method1) >= $x
 MERGE (obj1:Object {name: line.obj1})
 MERGE (obj2:Object {name: line.obj2})
 MERGE (obj1)-[:similar_on_method1]->(obj2)


 USING PERIODIC COMMIT
 LOAD CSV WITH HEADERS
 FROM 'file:///objects.csv'
 AS line
 WITH line
 WHERE toFloat(line.method2) >= $x
 MERGE (obj1:Object {name: line.obj1})
 MERGE (obj2:Object {name: line.obj2})
 MERGE (obj1)-[:similar_on_method2]->(obj2)


 USING PERIODIC COMMIT
 LOAD CSV WITH HEADERS
 FROM 'file:///objects.csv'
 AS line
 WITH line
 WHERE toFloat(line.method3) >= $x
 MERGE (obj1:Object {name: line.obj1})
 MERGE (obj2:Object {name: line.obj2})
 MERGE (obj1)-[:similar_on_method3]->(obj2)

1 Answer:

Answer 0: (score: 0)

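It depends on how big "big" is, I guess, but something along these lines should get you started. If the obj1 and obj2 values repeat in the data or already exist in the database, you will need an index on the property that MERGE matches on (here, name on :Object); otherwise every MERGE has to scan all existing nodes.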
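As a rough sketch of the index mentioned above: the statement below uses Neo4j 3.x syntax, which is an assumption based on the 2018 date of this thread. The commented-out alternative is the form used by later versions, and the index name object_name in it is purely illustrative.

```cypher
// Assumed Neo4j 3.x syntax: index the property that MERGE looks up.
CREATE INDEX ON :Object(name)

// Assumed equivalent for Neo4j 4.x and later (object_name is a made-up name):
// CREATE INDEX object_name FOR (o:Object) ON (o.name)
```

The index should be created before the import runs, so that each MERGE on :Object {name: ...} becomes an index lookup instead of a label scan.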

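// Keep only rows where every method meets its threshold ($x, $y, $z are query parameters).
// The method columns hold floats, so toFloat is used for the comparison.
LOAD CSV WITH HEADERS
FROM 'file:///objects.csv'
AS line
WITH line
WHERE toFloat(line.method1) >= $x
AND toFloat(line.method2) >= $y
AND toFloat(line.method3) >= $z
// MERGE creates each node and relationship only if it does not already exist.
MERGE (obj1:Object {name: line.obj1})
MERGE (obj2:Object {name: line.obj2})
MERGE (obj1)-[:LINK]->(obj2)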
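The query above links a pair only when all three thresholds are met, and it uses a single LINK relationship type. If the goal is one relationship type per method, as described in the question, a single pass over the file might look like the sketch below. It relies on the common FOREACH-over-CASE idiom for conditional writes. The aliases hit1, hit2, hit3 and the loop variable ignoreMe are illustrative names, and using $x, $y, $z as separate per-method thresholds is an assumption.

```cypher
// Sketch only. USING PERIODIC COMMIT exists in Neo4j 3.x and 4.x;
// newer versions would need a different batching mechanism.
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS
FROM 'file:///objects.csv'
AS line
// Evaluate each threshold once per row.
WITH line,
     toFloat(line.method1) >= $x AS hit1,
     toFloat(line.method2) >= $y AS hit2,
     toFloat(line.method3) >= $z AS hit3
// Skip rows that pass none of the thresholds.
WHERE hit1 OR hit2 OR hit3
MERGE (obj1:Object {name: line.obj1})
MERGE (obj2:Object {name: line.obj2})
// FOREACH over a one-element or empty list acts as a conditional write.
FOREACH (ignoreMe IN CASE WHEN hit1 THEN [1] ELSE [] END |
  MERGE (obj1)-[:similar_on_method1]->(obj2))
FOREACH (ignoreMe IN CASE WHEN hit2 THEN [1] ELSE [] END |
  MERGE (obj1)-[:similar_on_method2]->(obj2))
FOREACH (ignoreMe IN CASE WHEN hit3 THEN [1] ELSE [] END |
  MERGE (obj1)-[:similar_on_method3]->(obj2))
```

This reads the 7 GB file once instead of three times. A row with an empty or non-numeric method value yields null for that comparison, so it simply creates no relationship for that method.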