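I've just started learning py2neo and Neo4j, and I've run into a problem with duplicates. I'm writing a simple Python script that builds a database of scientific papers and their authors. All I need to do is add nodes for the papers and authors and add the relationships between them. I'm using the following code, which works correctly but is slow: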
paper = Node('Paper', id=post_id)
graph.merge(paper)
paper['created_time'] = created_time
graph.push(paper)
for author_id, author_name in paper_dict['authors']:
    researcher = Node('Person', id=author_id)
    graph.merge(researcher)
    researcher['name'] = author_name
    graph.push(researcher)
    wrote = Relationship(researcher, 'author', paper)
    graph.merge(wrote)
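So, to write multiple relationships at once, I tried using a transaction. My problem is that if I run it several times for the same paper and authors, it treats them as different entities and duplicates every node and relationship in the database (I ran the script several times). This did not happen with the previous code. Here is the code that uses a transaction: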
tx = graph.begin()
paper = Node('Paper', id=post_id)
paper['created_time'] = created_time
tx.create(paper)
for author_id, author_name in paper_dict['authors']:
    researcher = Node('Person', id=author_id)
    researcher['name'] = author_name
    tx.create(researcher)
    wrote = Relationship(researcher, 'author', paper)
    tx.create(wrote)
tx.commit()
Answer 0 (score: 1)
I believe you should use the merge function rather than the create function to avoid duplicates. Consider the following source code:
import py2neo
from py2neo import Graph, Node, Relationship

def authenticateAndConnect():
    # Register credentials for the server, then open the graph (py2neo v3-style calls).
    py2neo.authenticate('localhost:7474', 'user', 'password')
    return Graph('http://localhost:7474/default.graphdb/data/')

def actorsDictionary():
    # Unused placeholder.
    return

def createData():
    graph = authenticateAndConnect()
    tx = graph.begin()
    movie = Node('Movie', title='Answer')
    personDictionary = [{'name':'Dan', 'born':2001}, {'name':'Brown', 'born':2001}]
    # Merge the same people repeatedly to show that no duplicates appear.
    for i in range(10):
        for props in personDictionary:
            person = Node('Person', name=props['name'], born=props['born'])
            tx.merge(person)
            actedIn = Relationship(person, 'ACTED_IN', movie)
            tx.merge(actedIn)
    tx.commit()

if __name__ == '__main__':
    # Run the whole import several times; merge keeps the graph unchanged after the first run.
    for i in range(10):
        createData()
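Applied to the paper/author data from the question, the same idea can also be written as a single Cypher statement, which may help with the speed issue because each paper needs only one round trip. This is a minimal sketch, not a tested import routine: the helper names `build_merge_params` and `merge_paper` are made up for illustration, `graph` is assumed to be an already connected py2neo `Graph`, and the `$param` syntax assumes a Neo4j version that supports it.

```python
# Cypher that merges one paper, its authors, and the 'author' relationships.
# MERGE matches an existing node or relationship, or creates it if missing,
# so running the statement again does not add duplicates.
MERGE_PAPER_QUERY = """
MERGE (p:Paper {id: $paper_id})
SET p.created_time = $created_time
WITH p
UNWIND $authors AS a
MERGE (r:Person {id: a.id})
SET r.name = a.name
MERGE (r)-[:author]->(p)
"""


def build_merge_params(post_id, created_time, authors):
    """Hypothetical helper: turn (author_id, author_name) pairs into query parameters."""
    return {
        'paper_id': post_id,
        'created_time': created_time,
        'authors': [{'id': a_id, 'name': a_name} for a_id, a_name in authors],
    }


def merge_paper(graph, post_id, created_time, authors):
    """Hypothetical helper: send the whole paper in one statement.

    `graph` is assumed to be a connected py2neo Graph; Graph.run accepts
    the statement and a dict of parameters.
    """
    graph.run(MERGE_PAPER_QUERY, build_merge_params(post_id, created_time, authors))
```

With the question's variables this would be called as `merge_paper(graph, post_id, created_time, paper_dict['authors'])`. Matching only on `id` inside `MERGE` and setting the other properties with `SET` mirrors the merge-then-push pattern of the original, slower code. An index or uniqueness constraint on `Paper.id` and `Person.id` is likely to make the lookups faster.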