Question

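I have collections of food and restaurant objects, and I need to match every food object with its corresponding restaurant. I implemented a naive solution with time complexity O(n * m), where n and m are the sizes of the food and restaurant databases, respectively.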

def match_products(self):
    # Naive approach: compare every food with every restaurant, O(n * m).
    self._restaurant_dict = self._init_restaurant_dict()
    for food in foods():
        for restaurant in self._restaurant_dict:
            if self._matched(restaurant, food):
                self._restaurant_dict[restaurant].append(food)

def _init_restaurant_dict(self):
    # Start every restaurant with an empty list of matched foods.
    res_dict = {}
    for restaurant in restaurants():
        res_dict[restaurant] = []
    return res_dict

def _matched(self, restaurant, food):
    return restaurant.id == food.id

Restaurant and Food are defined as follows:

class Structure:
    _fields = []
    def __init__(self, *args):
        if len(args) != len(self._fields):
            raise TypeError("Wrong args number")
        for name, val in zip(self._fields,args):
            setattr(self, name, val)

    def __repr__(self):
        return ', '.join("%s: %s" % item for item in vars(self).items())

class Restaurant(Structure):
    _fields = ["id","name","owner"]

class Food(Structure):
    _fields = ["id","descriptions","calories"]

The methods foods() and restaurants() are generators. So how can I speed up this algorithm?

Answer 1

Use the id as the hash key of a lookup table.

lookup_table = dict()
for food in foods():
    # Group foods by id; each dict access is O(1) on average.
    if food.id not in lookup_table:
        lookup_table.update({food.id: [food]})
    else:
        lookup_table[food.id].append(food)
matched_candidates = {restaurant: lookup_table.get(restaurant.id, []) for restaurant in restaurants()}

Or something along those lines. This runs in O(N + M).
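As a minimal, self-contained sketch of the lookup-table idea above: it swaps the explicit if/else for collections.defaultdict and uses namedtuple stand-ins for the question's Restaurant and Food classes. The sample data and the bodies of foods() and restaurants() are made up for illustration.

```python
from collections import defaultdict, namedtuple

# Stand-ins for the question's classes (assumption: same field names).
Restaurant = namedtuple("Restaurant", ["id", "name", "owner"])
Food = namedtuple("Food", ["id", "descriptions", "calories"])

def restaurants():
    # Hypothetical sample data.
    yield Restaurant(1, "Papa-hut", "Ann")
    yield Restaurant(2, "Noodle Bar", "Bob")
    yield Restaurant(3, "Empty Cafe", "Cy")

def foods():
    # Hypothetical sample data; two foods share id 1.
    yield Food(1, "pizza", 800)
    yield Food(2, "ramen", 600)
    yield Food(1, "pasta", 700)

def match_products():
    # Pass 1: group foods by id, O(N).
    foods_by_id = defaultdict(list)
    for food in foods():
        foods_by_id[food.id].append(food)
    # Pass 2: one dict lookup per restaurant, O(M) on average.
    # .get() avoids creating empty entries for unmatched ids.
    return {r: foods_by_id.get(r.id, []) for r in restaurants()}

matched = match_products()
```

Each generator is consumed exactly once, which matters because a generator cannot be iterated a second time without calling the function again.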

Answer 2

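OK, to clarify, I am assuming you want to be able to select a food by the restaurant id and the first character of the food name. So, say "Papa-hut" has id 42 and you want a "pizza": you could look it up with the key 42p.

Why do it this way? Because I would expect the restaurant.id field to be a unique identifier, and concatenating a string onto a unique string still gives a unique string. So making the restaurant.id key more complex gives the lookup table a more specific search. However, it takes more accesses to get all the foods. You can experiment with this trade-off. Wiki on hash tables: Advantages/Drawbacks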

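matched_candidates = dict()
for food in foods():
    # Compound key: id plus the first letter of the food name, e.g. "42p".
    # Assumes Food has a `name` attribute (the question's class has `descriptions`).
    key = str(food.id) + food.name[0].lower()
    if key not in matched_candidates:
        matched_candidates.update({key: [food]})
    else:
        matched_candidates[key].append(food)

# Runs once, after the loop: add every restaurant not already a key.
matched_candidates.update({restaurant: []
                           for restaurant in restaurants()
                           if restaurant not in matched_candidates})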

The update adds restaurants that may not have any foods in the foods() generator. This is still O(N + M).

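I have to be honest, this feels wrong to me. It requires some special knowledge of the foods and restaurants to get into the table. But the lookup is fast, so maybe that is all you care about.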
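To make the trade-off concrete, here is a small sketch of the compound-key variant. It assumes each food carries a `name` attribute (the question's Food class has `descriptions` instead); the `make_key` helper and the sample data are hypothetical.

```python
from collections import defaultdict, namedtuple

# Assumption: foods have a `name` field; the question's Food class does not.
Food = namedtuple("Food", ["id", "name", "calories"])

def make_key(restaurant_id, name):
    # Hypothetical helper: id plus lowercased first letter, e.g. 42 + "Pizza" -> "42p".
    return "%s%s" % (restaurant_id, name[0].lower())

def foods():
    # Hypothetical sample data.
    yield Food(42, "Pizza", 800)
    yield Food(42, "Pasta", 700)
    yield Food(42, "Salad", 150)
    yield Food(7, "Pizza", 900)

# Build the table in one pass over the generator, O(N).
table = defaultdict(list)
for food in foods():
    table[make_key(food.id, food.name)].append(food)

# Look up by restaurant id and first letter; .get() leaves the table unchanged on a miss.
p_foods_at_42 = table.get(make_key(42, "pizza"), [])
```

With this data the key 42p holds both Pizza and Pasta, so the caller still has to scan the bucket for the exact item, and listing everything for restaurant 42 needs one lookup per first letter.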

Conditionally matching two databases in Python
