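I have grown to like ArangoDB's graph traversals: they let me visit any path or node with very little query hassle. However, I am stuck on a use case that is already implemented in Neo4j, and I believe other ArangoDB users will find this useful for their own work.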
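I have successfully imported the Google product taxonomy (a list of product categories) into an ArangoDB database, using a vertex collection named taxonomy and an edge collection named catof.
If I am correct, this query fetches all vertices together with their linked edges.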
FOR t IN taxonomy
    FOR c IN INBOUND t catof
        SORT c.name ASC
        RETURN {c}
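When loading the taxonomy documents, a parent vertex gets no edge if either the _from or the _to part is empty. I should mention that I use flask-script and python-arango for these operations, and both have been helpful.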
from flask_script import Manager

# `app` (the Flask application) and `storegraph` (a python-arango graph
# handle) are created elsewhere in the project.
manager = Manager(app)
tax_item = storegraph.vertex_collection('taxonomy')
catof = storegraph.edge_collection('catof')


@manager.command
def fetch_tree():
    # Map every category name to its parent name (None for top-level ones).
    dictionary = {}
    with open('input.csv') as file:
        for row in file:
            things = row.strip().split(' > ')
            dictionary[things[0]] = None
            for parent, child in zip(things, things[1:]):
                dictionary[child] = parent
    # for key in dictionary:
    #     tax_item.insert({"name": key})
    for child, parent in dictionary.items():
        # Look up both vertices by name; find() returns a cursor.
        c = list(tax_item.find({'name': child}))
        p = list(tax_item.find({'name': parent}))
        # Top-level categories have parent None, so p is empty and no edge is made.
        if c and p:
            # print('Child: %s parent: %s' % (c[0]['_id'], p[0]['_id']))
            catof.insert({'_from': c[0]['_id'], '_to': p[0]['_id']})
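The parsing half of fetch_tree can be tried without a database. Below is a minimal sketch, assuming each input line is a full category path separated by ' > ', as in the Google product taxonomy file. The helper name parent_map and the sample lines are made up for illustration.

```python
def parent_map(lines):
    """Map each category name to its parent name (None for a root).

    parent_map is a hypothetical helper, not part of python-arango.
    """
    parents = {}
    for line in lines:
        things = line.strip().split(' > ')
        if things == ['']:
            continue  # skip blank lines
        # The first element of a path is a top-level category.
        parents.setdefault(things[0], None)
        # Each later element is a child of the element before it.
        for parent, child in zip(things, things[1:]):
            parents[child] = parent
    return parents


# Sample lines invented for illustration.
sample = [
    'Computers',
    'Computers > Hardwares',
    'Computers > Hardwares > Hard disk',
]
print(parent_map(sample))
```

Because the mapping is keyed by name only, this approach assumes category names are unique across the whole taxonomy. Two categories with the same name under different parents would overwrite each other.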
After the operation I have the following sample vertices.
[
  {"_key": "5246198", "_id": "taxonomy/5246198", "name": "Computers"},
  {"_key": "5252911", "_id": "taxonomy/5252911", "name": "Hardwares"},
  {"_key": "5257587", "_id": "taxonomy/5257587", "name": "Hard disk"}
]
And edges:
[
  {"_key": "5269883", "_id": "catof/5269883", "_from": "taxonomy/5246198", "_to": "taxonomy/5252911"},
  {"_key": "5279833", "_id": "catof/5279833", "_from": "taxonomy/5252911", "_to": "taxonomy/5257587"}
]
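Now my questions are:
How do I fetch only the parent documents, i.e. Computers?
From a parent document, how do I print all of its children, in the format Computers, Hardwares, Hard Disks?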
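To make the expected result concrete, here is a minimal in-memory sketch over the sample documents above, written in plain Python rather than AQL. It assumes the direction used in the sample edges (_from is the parent, _to is the child) and treats a vertex that never appears as a _to as a top-level parent. The names roots and descendants are hypothetical helpers, not python-arango calls.

```python
# Sample documents copied from above, trimmed to the fields used here.
vertices = [
    {"_id": "taxonomy/5246198", "name": "Computers"},
    {"_id": "taxonomy/5252911", "name": "Hardwares"},
    {"_id": "taxonomy/5257587", "name": "Hard disk"},
]
edges = [
    {"_from": "taxonomy/5246198", "_to": "taxonomy/5252911"},
    {"_from": "taxonomy/5252911", "_to": "taxonomy/5257587"},
]


def roots(vertices, edges):
    """Vertices that are never the target of an edge, i.e. have no parent."""
    has_parent = {e["_to"] for e in edges}
    return [v for v in vertices if v["_id"] not in has_parent]


def descendants(start_id, vertices, edges):
    """Names of everything below start_id, walking edges parent to child."""
    by_id = {v["_id"]: v for v in vertices}
    children = {}
    for e in edges:
        children.setdefault(e["_from"], []).append(e["_to"])
    names, stack, seen = [], [start_id], {start_id}
    while stack:
        current = stack.pop()
        for child_id in children.get(current, []):
            if child_id not in seen:  # guard against accidental cycles
                seen.add(child_id)
                names.append(by_id[child_id]["name"])
                stack.append(child_id)
    return names


for root in roots(vertices, edges):
    line = [root["name"]] + descendants(root["_id"], vertices, edges)
    print(", ".join(line))
```

Note that the import code above inserts edges with the child as _from, which is the opposite of the sample edges. If the data is actually stored that way, the roles of _from and _to in this sketch would have to be swapped.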