How should I convert a neo4j Cypher / APOC load to neo4j-admin import?

Time: 2019-12-03 20:13:22

Tags: neo4j cypher py2neo neo4j-apoc

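I am working with email data and parsing it with Python, which produces a CSV every hour. With that CSV, I have 5 separate LOAD CSV commands that create/update nodes and relationships. They are:

NO ATTACHMENT OR LINK
URL ONLY
ATTACHMENT ONLY
URL AND ATTACHMENT
Attachment to Attachment Name, FileName Node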

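I would like to automate importing these files with a batch job. Because it is what I am most familiar with, I would prefer to do this with Python alone, but I have been looking around on Stack Overflow and elsewhere, and people recommend using neo4j-admin import. From the documentation, it looks very different from what I am doing, since it works with the --nodes and --relationships options. Can anyone help show me how to convert the Cypher/APOC LOAD CSV example below into a neo4j-admin import?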

// URL AND ATTACHMENT
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM ("file:///sessions/4_hour_parsed_and_ready.csv") AS row
MERGE (a:Sender { name: row.From, domain: row.Sender_Sub_Fld})
MERGE (b:Link { name: row.Url_Sub_Fld, topLevelDomain: row.Url_Tld, htmlEncodedMessage: row.HTML_Encoded})
MERGE (c:Attachment { name: row.FileHash, fileExtension: row.FileName_Ext, containsMultipleExtensions: row.MultipleExtensions})
MERGE (d:Recipient { name: row.To})
WITH a,b,c,d,row
WHERE NOT row.Url_Tld = "false" AND NOT row.FileHash = "false"
CALL apoc.merge.relationship(a, row.Outcome, {}, {}, b) YIELD rel as rel1
CALL apoc.merge.relationship(b, row.Outcome2, {}, {}, d) YIELD rel as rel2
CALL apoc.merge.relationship(a, row.Outcome, {}, {}, c) YIELD rel as rel3
CALL apoc.merge.relationship(c, row.Outcome2, {}, {}, d) YIELD rel as rel4
RETURN a,b,c,d

Or, alternatively, how to wrap this code in py2neo.

1 Answer:

Answer 0: (score: 0)

I just created a function that holds the server connection info, wrapped everything in a py2neo query, and executed it.

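# py_2_neo_pass is my own module holding the connection details
from py_2_neo_pass import db_server, db_user, db_password
from py2neo import Graph

# py2neo documents host and user as connection settings;
# ip_addr and username are not among them
graph = Graph(host=db_server, user=db_user, password=db_password)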

query='''
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM ("file:///sessions/4_hour_parsed_and_ready.csv") AS row
MERGE (a:Sender { name: row.From, domain: row.Sender_Sub_Fld})
MERGE (b:Link { name: row.Url_Sub_Fld, topLevelDomain: row.Url_Tld, htmlEncodedMessage: row.HTML_Encoded})
MERGE (c:Attachment { name: row.FileHash, fileExtension: row.FileName_Ext, containsMultipleExtensions: row.MultipleExtensions})
MERGE (d:Recipient { name: row.To})
WITH a,b,c,d,row
WHERE NOT row.Url_Tld = "false" AND NOT row.FileHash = "false"
CALL apoc.merge.relationship(a, row.Outcome, {}, {}, b) YIELD rel as rel1
CALL apoc.merge.relationship(b, row.Outcome2, {}, {}, d) YIELD rel as rel2
CALL apoc.merge.relationship(a, row.Outcome, {}, {}, c) YIELD rel as rel3
CALL apoc.merge.relationship(c, row.Outcome2, {}, {}, d) YIELD rel as rel4
RETURN a,b,c,d
'''

graph.run(query)
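
For the neo4j-admin import part of the question, here is a rough sketch of how the hourly CSV could be split into the separate node and relationship files that the tool reads through --nodes and --relationships. The function names build_import_tables and write_import_files, the output file names, and the choice of the name property as the node ID are my own assumptions for illustration. Also note that, as far as I know, neo4j-admin import is meant for the initial bulk load of an empty database, so for hourly updates the LOAD CSV approach above may still be the better fit.

```python
import csv
from pathlib import Path


def build_import_tables(rows):
    """Split parsed e-mail rows into de-duplicated node and relationship tables."""
    nodes = {"Sender": {}, "Link": {}, "Attachment": {}, "Recipient": {}}
    rels = set()
    for row in rows:
        # Same filter as the WHERE clause of the URL AND ATTACHMENT query.
        # Unlike the Cypher, filtered rows contribute no nodes here either.
        if row["Url_Tld"] == "false" or row["FileHash"] == "false":
            continue
        sender, link = row["From"], row["Url_Sub_Fld"]
        attachment, recipient = row["FileHash"], row["To"]
        # Assumption: name alone identifies a node, so the first occurrence
        # wins. MERGE in the query matches on all listed properties instead.
        nodes["Sender"].setdefault(sender, {"domain": row["Sender_Sub_Fld"]})
        nodes["Link"].setdefault(link, {
            "topLevelDomain": row["Url_Tld"],
            "htmlEncodedMessage": row["HTML_Encoded"],
        })
        nodes["Attachment"].setdefault(attachment, {
            "fileExtension": row["FileName_Ext"],
            "containsMultipleExtensions": row["MultipleExtensions"],
        })
        nodes["Recipient"].setdefault(recipient, {})
        # A set removes duplicates, standing in for apoc.merge.relationship.
        rels.add(("Sender", sender, row["Outcome"], "Link", link))
        rels.add(("Link", link, row["Outcome2"], "Recipient", recipient))
        rels.add(("Sender", sender, row["Outcome"], "Attachment", attachment))
        rels.add(("Attachment", attachment, row["Outcome2"], "Recipient", recipient))
    return nodes, rels


def write_import_files(nodes, rels, out_dir):
    """Write one CSV per label and one per (start label, end label) pair."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for label, by_name in nodes.items():
        props = sorted({key for values in by_name.values() for key in values})
        path = out_dir / f"{label}.csv"
        with open(path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            # name:ID(<label>) stores the ID as the name property and
            # scopes it to an ID space named after the label.
            writer.writerow([f"name:ID({label})", *props, ":LABEL"])
            for name, values in sorted(by_name.items()):
                writer.writerow([name, *(values[p] for p in props), label])
    # ID spaces are declared in the header, so each label pair gets its own file.
    by_pair = {}
    for start_label, start, rel_type, end_label, end in rels:
        by_pair.setdefault((start_label, end_label), []).append((start, end, rel_type))
    for (start_label, end_label), triples in by_pair.items():
        path = out_dir / f"{start_label}_{end_label}.csv"
        with open(path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow([f":START_ID({start_label})", f":END_ID({end_label})", ":TYPE"])
            writer.writerows(sorted(triples))
```

Here rows can be any iterable of dicts keyed by the CSV column names, for example a csv.DictReader over the hourly file. Each resulting node file would then be passed to neo4j-admin import with a --nodes option and each relationship file with a --relationships option. The dynamic relationship types from row.Outcome and row.Outcome2 end up in the :TYPE column, so no APOC call is needed for that path.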