Taxonomy, hierarchy vertices: how do I get the top-level parents and selected children?

Time: 2017-08-09 08:44:08

Tags: python hierarchy graph-databases taxonomy arangodb

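I really like ArangoDB's graph traversal: it lets me reach any path or node with very little query effort. However, I'm stuck on a scenario that has already been implemented in neo4j, and I believe anyone using ArangoDB will find the answer useful for their own future work.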

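I have successfully imported the list of product categories from the Google product taxonomy into an ArangoDB database, using a vertex collection named taxonomy and an edge collection named catof.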

If I'm right, this query lets me fetch all the vertices and their linked edges.

FOR t IN taxonomy
    FOR c IN INBOUND t catof
        SORT c.name ASC
        RETURN {c}
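
For reference, here is a minimal sketch of running this kind of query from python-arango. It assumes `db` is a python-arango database handle (for example one obtained through `ArangoClient`); the function name `inbound_neighbours` is hypothetical. Note that a single loop variable only yields the neighbouring vertices; binding a second variable (`e` below) also returns the connecting edge.

```python
# Sketch only: `db` is assumed to be a python-arango database handle.
INBOUND_QUERY = """
FOR t IN taxonomy
    FOR c, e IN INBOUND t catof
        SORT c.name ASC
        RETURN {target: t.name, vertex: c, edge: e}
"""


def inbound_neighbours(db):
    """Return one dict per (vertex, edge) pair whose edge points at a taxonomy vertex."""
    cursor = db.aql.execute(INBOUND_QUERY)
    return list(cursor)
```

Each result pairs a taxonomy vertex (`target`) with a vertex that has an edge pointing to it, plus that edge.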

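When feeding in the taxonomy documents, a parent vertex gets no edge if either the _from or the _to part is empty. I should mention that I use flask-script and python-arango for these operations; both have been very helpful.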

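from flask_script import Manager

# `app` (the Flask application) and `storegraph` (a python-arango graph
# handle) are assumed to be defined elsewhere in the project.
manager = Manager(app)
tax_item = storegraph.vertex_collection('taxonomy')
catof = storegraph.edge_collection('catof')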

@manager.command
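def fetch_tree():
    # Map each category name to its parent name (None for a top-level category).
    dictionary = {}

    with open('input.csv') as file:
        for row in file:
            # Each row is a ' > ' separated path from a top-level category downwards.
            things = row.strip().split(' > ')
            dictionary[things[0]] = None

            # Pair every category with the one directly before it in the path.
            for parent, child in zip(things, things[1:]):
                dictionary[child] = parent

    # for key in dictionary:
    #     tax_item.insert({"name": key})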

    for child, parent in dictionary.items():
        # edge_collection.insert_edge({from: vertex_collection /
        # parent, to: vertex_collection / child})

        # find() returns a cursor; materialize each result into a list.
        c = list(tax_item.find({'name': child}))
        p = list(tax_item.find({'name': parent}))

        # A top-level category has parent None, so no parent vertex
        # matches and no edge is created for it.
        if c and p:
            # print('Child: %s parent: %s' % (c[0]['_id'], p[0]['_id']))
            catof.insert({'_from': c[0]['_id'], '_to': p[0]['_id']})

After this operation, I have the following sample vertices.

[
  {"_key": "5246198", "_id": "taxonomy/5246198", "name": "Computers"},
  {"_key": "5252911", "_id": "taxonomy/5252911", "name": "Hardwares"},
  {"_key": "5257587", "_id": "taxonomy/5257587", "name": "Hard disk"}
]

And the edges:

[
  {"_key": "5269883", "_id": "catof/5269883", "_from": "taxonomy/5246198", "_to": "taxonomy/5252911"},
  {"_key": "5279833", "_id": "catof/5279833", "_from": "taxonomy/5252911", "_to": "taxonomy/5257587"}
]
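
To make the shape of this data concrete, here is a small in-memory sketch (plain Python, no database). It treats each edge as pointing from parent (`_from`) to child (`_to`), which is how the sample edges read; that direction is an assumption, and the two fields should be swapped if the edges point from child to parent. The helper names `find_roots` and `tree_lines` are hypothetical, and the sketch assumes the taxonomy is a tree without cycles.

```python
vertices = [
    {"_id": "taxonomy/5246198", "name": "Computers"},
    {"_id": "taxonomy/5252911", "name": "Hardwares"},
    {"_id": "taxonomy/5257587", "name": "Hard disk"},
]
edges = [
    {"_from": "taxonomy/5246198", "_to": "taxonomy/5252911"},
    {"_from": "taxonomy/5252911", "_to": "taxonomy/5257587"},
]


def find_roots(vertices, edges):
    """Top-level vertices: those that never appear as the `_to` of an edge."""
    has_parent = {e["_to"] for e in edges}
    return [v for v in vertices if v["_id"] not in has_parent]


def tree_lines(root, vertices, edges, depth=0):
    """Return the subtree starting at `root` as indented lines, depth-first."""
    by_id = {v["_id"]: v for v in vertices}
    lines = ["  " * depth + root["name"]]
    for e in edges:
        if e["_from"] == root["_id"]:
            lines += tree_lines(by_id[e["_to"]], vertices, edges, depth + 1)
    return lines


for root in find_roots(vertices, edges):
    print("\n".join(tree_lines(root, vertices, edges)))
```

With the three sample vertices, only Computers has no incoming edge, so it is the single root and the other two are listed beneath it.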

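Now my questions are:

How do I fetch only the parent documents, i.e. Computers?

From a parent document, how do I print all of its children? The format I want is:

Computers
Hardwares
Hard Disks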
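
One possible approach, sketched with python-arango's `db.aql.execute`. It assumes that edges point from parent to child as in the sample edges (otherwise swap `_to` for `_from` and `OUTBOUND` for `INBOUND`), that `db` is a python-arango database handle, and that a depth of 10 covers the whole taxonomy. The function names and the depth limit are illustrative, and the queries have not been run against the data in this question.

```python
# Top-level categories: vertices that no edge in catof points to.
ROOTS_QUERY = """
FOR t IN taxonomy
    LET parents = (FOR e IN catof FILTER e._to == t._id LIMIT 1 RETURN 1)
    FILTER LENGTH(parents) == 0
    SORT t.name ASC
    RETURN t
"""

# All descendants of one vertex, each returned as the list of names on its path.
CHILDREN_QUERY = """
FOR v, e, p IN 1..10 OUTBOUND @start catof
    RETURN p.vertices[*].name
"""


def fetch_roots(db):
    """Return the top-level taxonomy documents."""
    return list(db.aql.execute(ROOTS_QUERY))


def fetch_paths(db, start_id):
    """Return one ' > ' joined path per descendant of the vertex `start_id`."""
    # start_id is a document _id such as "taxonomy/5246198".
    cursor = db.aql.execute(CHILDREN_QUERY, bind_vars={"start": start_id})
    return [" > ".join(names) for names in cursor]
```

A caller could loop over `fetch_roots(db)` and pass each document's `_id` to `fetch_paths` to print every branch below it.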

0 answers:

No answers yet