Efficient structure for bidirectional search

Time: 2015-03-27 19:45:27

Tags: python search optimization structure bigdata

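I have 5 kinds of objects: Session, Query, Result, Action, and POI. After loading the values, I create a separate list for each kind (sessions_list, queries_list, ...). I then need to link these lists as follows: each session has a list of queries, each query has a list of results, each result may have a list of actions, and each action can have POIs. I have roughly 400K sessions, 700K queries, and so on. After that I need to search in both directions, for example the actions that belong to a session, or the sessions that contain a given action, .... How should I define my structure, given my data size and these kinds of iteration?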

class Session:

    def __init__(self, session_id, duration_seconds, device_name, start_time):
        self.session_id = session_id
        self.duration_seconds = duration_seconds
        self.device_name = device_name
        self.start_time_utc = start_time
        self.queries = []
        self.result_ids = []
        self.action_ids = []

class Query:
    def __init__(self, search_id, action_date, query_id, language, lat, longitude, total_results, query, country, country_subdivision, start_results, session_id):

        self.search_id = search_id
        self.action_date = action_date
        self.query_id = query_id
        self.results = []
        self.action_ids = []
        self.language = language
        self.latitude = lat
        self.longitude = longitude
        self.total_results = total_results
        self.query = query
        self.country = country
        self.country_subdivision = country_subdivision
        self.start_results = start_results
        self.session_id = session_id

class Result:
    def __init__(self, result_id, place_id, search_id, rank_of_result):
        self.result_id = result_id
        self.place_id = place_id
        self.search_id = search_id
        self.query_id = []
        self.rank_of_result = rank_of_result
        self.actions = []
        self.session_ids = []


class Action:
    def __init__(self, action_id, date, act_type, place_name, query_id, rank_of_result, place_id, session_id):
        self.action_id = action_id
        self.action_date_utc = date
        self.action_type = act_type
        self.place_name = place_name
        self.query_id = query_id
        self.rank_of_result = rank_of_result
        self.place_id = place_id
        self.session_id = session_id
        self.search_ids = []
        self.results_ids = []
        self.pois = []


class POI:
    def __init__(self, place_id, latitude, longitude, name, country):
        self.place_id = place_id
        self.latitude = latitude
        self.longitude = longitude
        self.name = name
        self.country = country
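
To make the two search directions concrete, here is a minimal sketch of one possible layout, not a definitive design: keep each relation (session to query, query to result, result to action) as a pair of dictionaries keyed by ID, one per direction, and chain them to walk up or down the hierarchy. The names `BidirectionalIndex`, `descend`, `ascend`, and the sample IDs are hypothetical and do not come from the classes above; the sketch also assumes that all IDs are hashable and that the data fits in memory.

```python
from collections import defaultdict


class BidirectionalIndex:
    """Parent/child links stored by ID, queryable in both directions."""

    def __init__(self):
        self.children_of = defaultdict(set)  # parent_id -> {child_id, ...}
        self.parents_of = defaultdict(set)   # child_id -> {parent_id, ...}

    def link(self, parent_id, child_id):
        # Record the relation once in each direction.
        self.children_of[parent_id].add(child_id)
        self.parents_of[child_id].add(parent_id)

    def children(self, parent_id):
        # .get avoids creating empty entries for unknown IDs.
        return self.children_of.get(parent_id, set())

    def parents(self, child_id):
        return self.parents_of.get(child_id, set())


def descend(start_ids, *indexes):
    """Follow parent -> child links through each index in turn."""
    current = set(start_ids)
    for index in indexes:
        current = {c for p in current for c in index.children(p)}
    return current


def ascend(start_ids, *indexes):
    """Follow child -> parent links through each index in turn."""
    current = set(start_ids)
    for index in indexes:
        current = {p for c in current for p in index.parents(c)}
    return current


# Hypothetical sample data: two sessions, three queries, three results, two actions.
session_query = BidirectionalIndex()
query_result = BidirectionalIndex()
result_action = BidirectionalIndex()

session_query.link("s1", "q1")
session_query.link("s1", "q2")
session_query.link("s2", "q3")
query_result.link("q1", "r1")
query_result.link("q2", "r2")
query_result.link("q3", "r3")
result_action.link("r1", "a1")
result_action.link("r3", "a2")

# Downward: actions that belong to session "s1".
actions_of_s1 = descend(["s1"], session_query, query_result, result_action)

# Upward: sessions that contain action "a2".
sessions_with_a2 = ascend(["a2"], result_action, query_result, session_query)
```

Each lookup is an average constant-time dictionary access per hop, so a query touches only the IDs that are actually linked rather than scanning the full lists. The cost is that every relation is stored twice, once per direction.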

0 answers:

No answers