Question

I have the following hierarchical entities:

Country, City, Street.
Every country has cities, every city has streets.

I wrote pseudocode for what I want to run:

handle_country_before(country)
for city in country:
    handle_city_before(city)
    for street in city:
        handle_street_before(street)
        handle_street_after(street)
    handle_city_after(city)
handle_country_after(country)
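
For reference, a minimal runnable sketch of the traversal above. The nested dict layout and the `traverse` helper are assumptions for illustration only; the handlers are replaced by entries appended to a log so the call order can be inspected.

```python
# Hypothetical in-memory hierarchy: one country -> cities -> streets.
country = {
    "name": "x1",
    "cities": [
        {"name": "y1", "streets": ["s1", "s2", "s3"]},
        {"name": "y2", "streets": ["s4", "s5"]},
    ],
}


def traverse(country, log):
    """Visit every level and record before/after events in nesting order."""
    log.append(("country_before", country["name"]))
    for city in country["cities"]:
        log.append(("city_before", city["name"]))
        for street in city["streets"]:
            log.append(("street_before", street))
            log.append(("street_after", street))
        log.append(("city_after", city["name"]))
    log.append(("country_after", country["name"]))
    return log


events = traverse(country, [])
```

Each "before" event is matched by an "after" event at the same level, which is the property any storage format has to let me reproduce.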

I tried the following approaches:

Document (NoSQL) approach:

I stored all the data in a flat layout:

{
    country : { 
        # Country "x1" info corresponding with the street
    },
    city : {
        # City "y1" info corresponding with the street
    },
    street : {
        # street info...
    }
}

{
    country : { 
        # Country "x1" info corresponding with the street
    },
    city : {
        # City "y1" info corresponding with the street
    },
    street : {
        # street info...
    }
}

{
    country : { 
        # Country "x1" info corresponding with the street
    },
    city : {
        # City "y1" info corresponding with the street
    },
    street : {
        # street info...
    }
}

{
    country : { 
        # Country "x1" info corresponding with the street
    },
    city : {
        # City "y2" info corresponding with the street
    },
    street : {
        # street info...
    }
}

{
    country : { 
        # Country "x1" info corresponding with the street
    },
    city : {
        # City "y2" info corresponding with the street
    },
    street : {
        # street info...
    }
}

With this approach, I had to use the following pseudocode:

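    last_country = None
    last_city = None
    for element in elements:
        country_changed = last_country is None or element.country.id != last_country.id
        city_changed = country_changed or element.city.id != last_city.id
        # Close the previous city (and country) before opening new ones
        if city_changed and last_city is not None:
            handle_city_after(last_city)
        if country_changed:
            if last_country is not None:
                handle_country_after(last_country)
            handle_country_before(element.country)
            last_country = element.country
        if city_changed:
            handle_city_before(element.city)
            last_city = element.city
        # Every element describes exactly one street
        handle_street_before(element.street)
        handle_street_after(element.street)
    # Close whatever is still open after the last element
    if last_city is not None:
        handle_city_after(last_city)
    if last_country is not None:
        handle_country_after(last_country)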

Drawbacks: this approach feels like overkill, a flat structure does not fit my case, and it is very slow and space-inefficient.
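
A minimal sketch of the same change-detection loop written with `itertools.groupby`, assuming the flat records arrive already sorted by country and then by city. The dict-based records and the `walk` helper are hypothetical; handlers are again replaced by log entries.

```python
from itertools import groupby

# Hypothetical flat records, one per street, sorted by (country, city).
elements = [
    {"country": "x1", "city": "y1", "street": "s1"},
    {"country": "x1", "city": "y1", "street": "s2"},
    {"country": "x1", "city": "y1", "street": "s3"},
    {"country": "x1", "city": "y2", "street": "s4"},
    {"country": "x1", "city": "y2", "street": "s5"},
]


def walk(elements):
    """Rebuild the nested before/after order from sorted flat records."""
    events = []
    for country, in_country in groupby(elements, key=lambda e: e["country"]):
        events.append(("country_before", country))
        # The inner groupby consumes only the rows of the current country.
        for city, in_city in groupby(in_country, key=lambda e: e["city"]):
            events.append(("city_before", city))
            for element in in_city:
                events.append(("street_before", element["street"]))
                events.append(("street_after", element["street"]))
            events.append(("city_after", city))
        events.append(("country_after", country))
    return events


events = walk(elements)
```

This removes the manual bookkeeping but not the duplication: every record still carries its country and city data.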

SQL approach:

I stored each entity in its own table (Country, City, Street) and iterated over them with the following code:

    country_cursor = query('select * from countries')
    for country in country_cursor:
        handle_country_before(country)
        city_cursor = query('select * from cities where parent_country_ref=%s' % (country.id))
        for city in city_cursor:
            street_cursor = query('select * from streets where parent_city_ref=%s' % (city.id))
            ...
            ...
        ...
        handle_country_after(country)

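At first this looked like the best approach. But as I added more metadata tables and had to use JOIN statements, it got slower and slower. I then tried materialized views to speed things up, but ended up with the same results as with the document approach.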
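
A minimal runnable sketch of the nested-cursor loop, using Python's `sqlite3` module with an in-memory database as a stand-in for the real one. The table and column names follow the queries above; the `name` columns, the sample rows, and the indexes on the parent references are assumptions. It uses `?` placeholders instead of string formatting to build the queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT, parent_country_ref INTEGER);
    CREATE TABLE streets (id INTEGER PRIMARY KEY, name TEXT, parent_city_ref INTEGER);
    -- Assumed indexes so each child lookup avoids a full table scan.
    CREATE INDEX idx_cities_parent ON cities (parent_country_ref);
    CREATE INDEX idx_streets_parent ON streets (parent_city_ref);
    INSERT INTO countries VALUES (1, 'x1');
    INSERT INTO cities VALUES (1, 'y1', 1), (2, 'y2', 1);
    INSERT INTO streets VALUES (1, 's1', 1), (2, 's2', 1), (3, 's3', 1),
                               (4, 's4', 2), (5, 's5', 2);
""")


def walk(conn):
    """Nested iteration: one query per country and one per city."""
    events = []
    for country_id, country in conn.execute("SELECT id, name FROM countries ORDER BY id"):
        events.append(("country_before", country))
        city_rows = conn.execute(
            "SELECT id, name FROM cities WHERE parent_country_ref = ? ORDER BY id",
            (country_id,),
        )
        for city_id, city in city_rows:
            events.append(("city_before", city))
            street_rows = conn.execute(
                "SELECT name FROM streets WHERE parent_city_ref = ? ORDER BY id",
                (city_id,),
            )
            for (street,) in street_rows:
                events.append(("street_before", street))
                events.append(("street_after", street))
            events.append(("city_after", city))
        events.append(("country_after", country))
    return events


events = walk(conn)
```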

Custom format approach:

I tried storing the information in my own binary serialization format:

<number of countries>[1st-country-data]<number of cities>[1st-city-data]<number of streets>[1st-street-data][2nd-street-data][3rd-street...]...

Drawbacks: this does not scale, I cannot update the information in place, I cannot fetch a specific city/street, and every search is O(n).
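
A minimal sketch of such a count-prefixed layout, using Python's `struct` module. The concrete encoding (little-endian 32-bit counts, 16-bit length-prefixed UTF-8 names as the only per-entity data) is an assumption chosen for illustration, not the actual format I used.

```python
import struct


def pack_name(name):
    """Encode a name as a 16-bit length followed by UTF-8 bytes (assumed encoding)."""
    raw = name.encode("utf-8")
    return struct.pack("<H", len(raw)) + raw


def serialize(countries):
    """Write <count>[country]<count>[city]<count>[street]... depth first."""
    out = [struct.pack("<I", len(countries))]
    for country, cities in countries:
        out.append(pack_name(country))
        out.append(struct.pack("<I", len(cities)))
        for city, streets in cities:
            out.append(pack_name(city))
            out.append(struct.pack("<I", len(streets)))
            for street in streets:
                out.append(pack_name(street))
    return b"".join(out)


def deserialize(buf):
    """Read the buffer back sequentially; there is no index, so access is O(n)."""
    pos = 0

    def read_u32():
        nonlocal pos
        (value,) = struct.unpack_from("<I", buf, pos)
        pos += 4
        return value

    def read_name():
        nonlocal pos
        (length,) = struct.unpack_from("<H", buf, pos)
        pos += 2
        name = buf[pos:pos + length].decode("utf-8")
        pos += length
        return name

    countries = []
    for _ in range(read_u32()):
        country = read_name()
        cities = []
        for _ in range(read_u32()):
            city = read_name()
            streets = [read_name() for _ in range(read_u32())]
            cities.append((city, streets))
        countries.append((country, cities))
    return countries


data = [("x1", [("y1", ["s1", "s2", "s3"]), ("y2", ["s4", "s5"])])]
blob = serialize(data)
restored = deserialize(blob)
```

Because every record has a variable length and there is no offset table, reaching one particular street means decoding everything before it, and growing a record means rewriting the rest of the file.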

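What I am looking for is a serialization format / DB that is:

Able to add/update fields on existing elements
Efficient in speed, space, and memory
C-compatible (no C++)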

Answer 1

In your case, the fastest approach is to use denormalized data: create a flat file/table containing the following columns:

country    city    street    current mayor    additional information

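Of course, the data should be sorted, and of course you should not use a heavy-to-parse format such as JSON/XML, only plain text.

You will then be able to iterate over this array with a single loop.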

To speed up iteration, you can try splitting the single file into several files with fixed-width rows.
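
A minimal sketch of what fixed-width rows buy you, assuming ASCII text and the hypothetical column widths below. An in-memory `io.BytesIO` buffer stands in for a real file. Because every row has the same size, row `i` starts at byte `i * ROW_SIZE` and can be read with a single seek.

```python
import io

# Assumed column widths in bytes for country, city, street.
WIDTHS = (8, 8, 16)
ROW_SIZE = sum(WIDTHS) + 1  # +1 for the trailing newline


def write_rows(f, rows):
    """Write sorted rows as space-padded fixed-width lines (values must fit the widths)."""
    for row in rows:
        line = "".join(value.ljust(width) for value, width in zip(row, WIDTHS))
        f.write(line.encode("ascii") + b"\n")


def read_row(f, index):
    """Jump straight to row `index` and split it back into fields."""
    f.seek(index * ROW_SIZE)
    raw = f.read(ROW_SIZE)
    fields, offset = [], 0
    for width in WIDTHS:
        fields.append(raw[offset:offset + width].decode("ascii").rstrip())
        offset += width
    return tuple(fields)


rows = [
    ("x1", "y1", "s1"),
    ("x1", "y1", "s2"),
    ("x1", "y1", "s3"),
    ("x1", "y2", "s4"),
    ("x1", "y2", "s5"),
]
f = io.BytesIO()
write_rows(f, rows)
fourth = read_row(f, 3)
```

The same layout maps directly onto an array of fixed-size structs in C, and since the rows are sorted, a specific country or city can be located by binary search instead of a full scan.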
