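I want to import data into my Neo4j database. From my raw data, I have generated a lot of Cypher statements.
For example, I have a list of Cypher statements like this (up to 100,000 of them):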
MERGE (product:PRODUCT{name:'X phone'}) MERGE (product)-[:RATE]-(review:REVIEW{content:'worst phone ever'})
MERGE (product:PRODUCT{name:'X phone'}) MERGE (product)-[:RATE]-(review:REVIEW{content:'cheapest phone ever'})
MERGE (product:PRODUCT{name:'Y phone'}) MERGE (product)-[:RATE]-(review:REVIEW{content:'even worse than phone X'})
MERGE (product:PRODUCT{name:'X phone'}) MERGE (product)-[:RATE]-(review:REVIEW{content:'better than newly release Y version'})
My current solution is to use the Neo4j driver for Python to run the Cypher statements from a file, one line at a time.
# neo4j.v1 is the 1.x driver namespace; newer drivers use `from neo4j import GraphDatabase`
from neo4j.v1 import GraphDatabase
import sys


class CypherClient:
    """
    Client that executes Cypher statements.
    """

    def __init__(self, uri, auth):
        self.driver = GraphDatabase.driver(uri, auth=auth)

    def run_cypher(self, cypher):
        """
        Execute a single Cypher statement.
        :param cypher: the Cypher statement as a str
        :return: None
        """
        # One session and one round trip per statement
        with self.driver.session() as session:
            session.run(cypher).single()


if __name__ == "__main__":
    # Execute Cypher from a file; each line is an independent statement.
    # Usage: python exec_cypher_file.py outcypher.txt

    # replace URI and authentication here
    uri = "bolt://localhost:7687"
    auth = ("neo4j", "IAmPusheenTheCat")
    counter = 0
    if len(sys.argv) < 2:
        # No input file given
        sys.exit("usage: python exec_cypher_file.py <cypher_file>")
    client = CypherClient(uri, auth)
    infile = sys.argv[1]
    with open(infile) as source, open(infile + ".err.txt", "w") as errfile:
        for line in source:
            try:
                client.run_cypher(line)
            except Exception:
                # Log the failing line and continue with the next one
                print(str(counter) + " " + line + "\n")
                errfile.write(str(counter) + " " + line + "\n")
            counter += 1
            if counter % 100 == 0 or counter < 100:
                print(counter)
    print('done')
What can I do to run a large number of Cypher statements more efficiently?
Answer 0 (score: 1)
CSV loading tends to be very efficient, so if you have your data in CSV format, you can use LOAD CSV.
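As a rough sketch of the CSV route, assuming the raw data can be reduced to (product name, review content) pairs: the helper below writes those pairs as CSV, and LOAD_QUERY is a LOAD CSV statement that applies the same two MERGE clauses from the question to every row. The names write_reviews_csv, LOAD_QUERY and reviews.csv are made up for illustration, and the file:/// URL assumes the file sits in the server's import directory.

```python
import csv
import io

# Hypothetical LOAD CSV statement; it would be run once (for example via
# session.run) after reviews.csv is in the server's import directory.
LOAD_QUERY = """
LOAD CSV WITH HEADERS FROM 'file:///reviews.csv' AS row
MERGE (product:PRODUCT {name: row.name})
MERGE (product)-[:RATE]-(review:REVIEW {content: row.content})
"""


def write_reviews_csv(rows, fileobj):
    """Write (name, content) pairs as CSV with a header row; return the row count."""
    writer = csv.writer(fileobj)
    writer.writerow(["name", "content"])
    count = 0
    for name, content in rows:
        writer.writerow([name, content])
        count += 1
    return count


# Demo on an in-memory buffer; a real load would write to a file instead.
rows = [
    ("X phone", "worst phone ever"),
    ("Y phone", "even worse than phone X"),
]
buf = io.StringIO()
written = write_reviews_csv(rows, buf)
print(buf.getvalue())
```

When writing to a real file, open it with newline="" as the csv module documentation recommends. Creating an index on the PRODUCT name property before loading would likely help the MERGE lookups, though that depends on the data.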
Otherwise, you could look at Michael Hunger's article on effective batch updates, which uses UNWIND to process input lists in batches.
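A minimal sketch of the UNWIND idea, not taken from the article itself: instead of one statement per review, the values are sent as a list parameter and a single parameterized statement handles a whole batch. The names BATCH_QUERY, chunked and load_reviews are hypothetical, the batch size of 1,000 is an arbitrary starting point, and the $rows syntax assumes a server version that accepts $-style parameters.

```python
from itertools import islice

# One parameterized statement processes every row in the $rows list.
BATCH_QUERY = """
UNWIND $rows AS row
MERGE (product:PRODUCT {name: row.name})
MERGE (product)-[:RATE]-(review:REVIEW {content: row.content})
"""


def chunked(rows, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def load_reviews(session, rows, batch_size=1000):
    """Send rows (dicts with 'name' and 'content') in batches; return rows sent.

    `session` is assumed to be a neo4j driver session, or any object whose
    run(query, **params) returns a result with a consume() method.
    """
    total = 0
    for batch in chunked(rows, batch_size):
        # One round trip per batch instead of one per row
        session.run(BATCH_QUERY, rows=batch).consume()
        total += len(batch)
    return total
```

With a real driver this could be called as `with driver.session() as session: load_reviews(session, rows)`. For 100,000 rows and the default batch size, that is 100 requests rather than 100,000, and because the values travel as parameters the server can reuse the query plan.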