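I have two lists of objects, and I need to find matching objects according to two different sets of attributes. Say I have Vehicle() objects: I need to find all vehicles in the first list that equal a vehicle in the second list, first matching on color and then matching on brand. I have two solutions, but I am not sure either is the best I can do. (I really need to optimize this for performance.)
So say I have:
class Vehicle(object):
    def __init__(self, color, brand):
        self._color = color
        self._brand = brand
and lists of objects like these: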
vehicles1= [Vehicle('blue','fiat'), Vehicle('red','volvo'), Vehicle('red','fiat')]
vehicles2 = [Vehicle('blue', 'volvo'), Vehicle('red', 'BMW')]
The first solution, which seems slow, works with list comprehensions only:
intersect_brand_wise = [x for x in vehicles1 for y in vehicles2 if x._brand == y._brand]
and then
intersect_color_wise = [x for x in vehicles1 for y in vehicles2 if x._color == y._color]
The second solution I came up with is to define equality:
class Vehicle(object):
    def __init__(self, color, brand):
        self._color = color
        self._brand = brand
    def __eq__(self, other):
        if isinstance(other, Vehicle):
            return self._brand == other._brand
        return False
    def __hash__(self):
        return hash((self._color, self._brand))
Now getting the brand-wise intersection is trivial:
intersect_brand_wise = [x for x in vehicles1 if x in vehicles2]
To get the color-wise intersection, I did the following:
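class Car(Vehicle):
    def __init__(self, color, brand):
        Vehicle.__init__(self, color, brand)
    def __hash__(self):
        # Call the parent method; returning Vehicle.__hash__ itself would hand back a function, not an int
        return Vehicle.__hash__(self)
    def __eq__(self, other):
        if isinstance(other, Car):
            return other._color == self._color
        return False
def change_to_car(obj):
    # Rebinds the class of the existing object in place; no copy is made
    obj.__class__ = Car
    return obj
# list() so the results can be scanned repeatedly on Python 3, where map() returns a one-shot iterator
cars1 = list(map(change_to_car, vehicles1))
cars2 = list(map(change_to_car, vehicles2))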
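So,
intersect_color_wise = [x for x in cars1 if x in cars2]
gives the second intersection.
However, this seems to me a very clumsy way of handling it, and I really do need this to perform well.
Any suggestions on how to do this better?
Thanks in advance, M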
Answer 0: (score: 0)
How well does this perform in your case? I don't have the full data to simulate the performance for a proper test...:
def get_intersections(list1, list2):
    # Collect the brands and colors of list2 into two sets for O(1) average lookups
    brands, colors = map(set, zip(*[(v._brand, v._color) for v in list2]))
    inter_brands = [v for v in list1 if v._brand in brands]
    inter_colors = [v for v in list1 if v._color in colors]
    return inter_brands, inter_colors
If you want, you can also write the intersection for a single attribute:
from operator import attrgetter
def get_intersection(list1, list2, attr:str):
    getter = attrgetter(attr)
    t_set = {getter(v) for v in list2}
    results = [v for v in list1 if getter(v) in t_set]
    return results
# use it like this:
get_intersection(vehicles1, vehicles2, "_brand")
You can also use attrgetter to scale the first function to any number of attributes:
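def get_intersections(list1, list2, *attrs:str):
    getter = attrgetter(*attrs)
    if len(attrs) > 1:
        # attrgetter returns a tuple per vehicle; zip(*...) regroups the values per attribute
        sets = list(map(set, zip(*[getter(v) for v in list2])))
    else:
        sets = [{getter(v) for v in list2}]
    # Filter list1 (the parameter, not the global vehicles1) once per attribute
    results = {attr: [v for v in list1 if getattr(v, attr) in sets[s]]
               for s, attr in enumerate(attrs)}
    return results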
Testing:
>>> get_intersections(vehicles1, vehicles2, "_brand", "_color")
{'_brand': [<__main__.Vehicle object at 0x03588910>], '_color': [<__main__.Vehicle object at 0x035889D0>, <__main__.Vehicle object at 0x03588910>, <__main__.Vehicle object at 0x035889F0>]}
>>> get_intersections(vehicles1, vehicles2, "_brand")
{'_brand': [<__main__.Vehicle object at 0x03588910>]}
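As a quick sanity check, the sketch below compares the set-based lookup against a brute-force scan on the sample data from the question. The `__repr__` method and the helper name `brute_force` are additions for illustration only; they are not part of the code above.

```python
from operator import attrgetter


class Vehicle(object):
    def __init__(self, color, brand):
        self._color = color
        self._brand = brand

    def __repr__(self):
        # Added for readable output; not in the original class.
        return "Vehicle({!r}, {!r})".format(self._color, self._brand)


def get_intersection(list1, list2, attr):
    # Build the lookup set once; each membership test is then O(1) on average.
    getter = attrgetter(attr)
    wanted = {getter(v) for v in list2}
    return [v for v in list1 if getter(v) in wanted]


def brute_force(list1, list2, attr):
    # Hypothetical reference version: compares every pair, O(len(list1) * len(list2)).
    getter = attrgetter(attr)
    return [x for x in list1 if any(getter(x) == getter(y) for y in list2)]


vehicles1 = [Vehicle('blue', 'fiat'), Vehicle('red', 'volvo'), Vehicle('red', 'fiat')]
vehicles2 = [Vehicle('blue', 'volvo'), Vehicle('red', 'BMW')]

by_brand = get_intersection(vehicles1, vehicles2, "_brand")
by_color = get_intersection(vehicles1, vehicles2, "_color")
print(by_brand)  # [Vehicle('red', 'volvo')]
print(by_color)
```

Both versions return the same objects in the same order here; the difference is only in how many comparisons they need as the lists grow.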
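Answer 1: (score: 0)
First, as far as I can tell, two cars are equal when both their brand and their color are equal. In general, code what makes sense (the wording here is odd):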
class Vehicle(object):
    def __init__(self, color, brand):
        self.color = color  # Why the underscores everywhere, e.g. _brand? Those usually indicate something special-ish
        self.brand = brand
    def __eq__(self, other):
        # Duck typing. If it quacks like a car...:
        return self.brand == other.brand and self.color == other.color
    def __hash__(self):
        # Needed on Python 3, where defining __eq__ alone makes instances unhashable; kept consistent with __eq__
        return hash((self.color, self.brand))
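Intersecting two sets is a "hard problem". The best we can do is what Python offers (look for intersection under set), and in the worst case that costs the product of the lengths of the two sets. Of course, you cannot use a set here.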
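For plain hashable values such as brand strings, the built-in set intersection looks like the short sketch below. The two brand lists are made-up sample data, and the result contains only the shared values, not the matching objects.

```python
brands1 = ['fiat', 'volvo', 'fiat']
brands2 = ['volvo', 'BMW']

# set() removes duplicates; & is the set intersection operator
shared = set(brands1) & set(brands2)
print(shared)  # {'volvo'}
```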
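In your case, though, I think the problem is simpler. There is no need to force a comprehension. You want to classify by two things? Do it the natural way and it will be faster: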
vcol = {'blue':set(), 'red':set() ... }  # Or add keys in the loop below if the list is too long or uncertain
vbrand = {'fiat':set(), ... }
for v in cars1:
    vcol[v.color].add(v)
    vbrand[v.brand].add(v)
Now just build the lists. Note that the in operator below is O(1):
colorbuddies = []
brandbuddies = []
# Could be split into two comprehensions. Not sure two runs is worth it.
for v in cars2:
    # in operator depends on __eq__ defined above
    if v in vcol[v.color]: colorbuddies.append(v)
    if v in vbrand[v.brand]: brandbuddies.append(v)
All in all, we have one linear pass over each of the two lists!
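A minimal sketch of the "add keys in the loop" variant that the code comment hints at, using `collections.defaultdict` so that no colors or brands have to be declared up front. It collects the vehicles of the first list that share a color (or brand) with any vehicle of the second list, which is one reading of what the question asks for. The names `group_by` and `matches` are hypothetical.

```python
from collections import defaultdict


class Vehicle(object):
    def __init__(self, color, brand):
        self.color = color
        self.brand = brand


def group_by(vehicles, attr):
    # Map each attribute value to the list of vehicles that have it.
    groups = defaultdict(list)
    for v in vehicles:
        groups[getattr(v, attr)].append(v)
    return groups


def matches(list1, list2, attr):
    groups = group_by(list1, attr)
    seen = set()
    result = []
    for v in list2:
        key = getattr(v, attr)
        # Testing with `in` avoids creating empty buckets in the defaultdict.
        if key in groups and key not in seen:
            seen.add(key)
            result.extend(groups[key])
    return result


cars1 = [Vehicle('blue', 'fiat'), Vehicle('red', 'volvo'), Vehicle('red', 'fiat')]
cars2 = [Vehicle('blue', 'volvo'), Vehicle('red', 'BMW')]

colorbuddies = matches(cars1, cars2, 'color')
brandbuddies = matches(cars1, cars2, 'brand')
```

Each list is walked once, and the buckets are plain lists, so the vehicles themselves do not need to be hashable in this variant.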
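Answer 2: (score: 0)
Python is, after all, a dynamic language. That means you can monkey-patch the Vehicle class at will to fit your needs. You just prepare two other classes, one with brand equality and one with color equality (I made them subclasses of Vehicle so that autocompletion works), and then assign their members to the Vehicle class: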
class Vehicle(object):
    def __init__(self, color, brand):
        self._color = color
        self._brand = brand
class Vehicle_brand(Vehicle):
    def __eq__(self, other):
        return self._brand == other._brand
    def __hash__(self):
        return hash(self._brand)
class Vehicle_color(Vehicle):
    def __eq__(self, other):
        return self._color == other._color
    def __hash__(self):
        return hash(self._color)
To get the brand intersection:
Vehicle.__eq__ = Vehicle_brand.__eq__
Vehicle.__hash__ = Vehicle_brand.__hash__
intersect_brand_wise = [x for x in vehicles1 if x in vehicles2]
Then to get the color intersection:
Vehicle.__eq__ = Vehicle_color.__eq__
Vehicle.__hash__ = Vehicle_color.__hash__
intersect_color_wise = [x for x in vehicles1 if x in vehicles2]
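The nice thing here is that if your Vehicle class has other members, they are left untouched when you swap the equality part, and you never copy or duplicate any object: only 2 methods on the class object change.
It may not be very pure, but it should work...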
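Because the patch changes the class for every piece of code that uses it, one way to limit its scope is a small context manager that restores the previous methods afterwards. This is only a sketch; `patched_equality`, `brand_eq`, and `brand_hash` are hypothetical names, not part of the answer.

```python
from contextlib import contextmanager

_MISSING = object()  # marks a method the class did not define itself


@contextmanager
def patched_equality(cls, eq, hash_):
    # Remember what the class itself defined (not what it inherits).
    saved = {name: cls.__dict__.get(name, _MISSING) for name in ("__eq__", "__hash__")}
    cls.__eq__ = eq
    cls.__hash__ = hash_
    try:
        yield cls
    finally:
        # Put the class back exactly as it was, even if the body raised.
        for name, value in saved.items():
            if value is _MISSING:
                delattr(cls, name)
            else:
                setattr(cls, name, value)


class Vehicle(object):
    def __init__(self, color, brand):
        self._color = color
        self._brand = brand


def brand_eq(self, other):
    return self._brand == other._brand


def brand_hash(self):
    return hash(self._brand)


vehicles1 = [Vehicle('blue', 'fiat'), Vehicle('red', 'volvo'), Vehicle('red', 'fiat')]
vehicles2 = [Vehicle('blue', 'volvo'), Vehicle('red', 'BMW')]

with patched_equality(Vehicle, brand_eq, brand_hash):
    intersect_brand_wise = [x for x in vehicles1 if x in vehicles2]
```

Outside the `with` block, Vehicle compares by identity again, so other code that relies on the default behaviour is unaffected.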
Answer 3: (score: 0)
Question: checking whether objects are equal based on different attributes
Instead of searching for equal objects afterwards with a loop in loop, check for equality at instantiation time.
To save memory, store only the hash of the equal objects.
Use class attributes to hold a dict of the objects from set 1 and a list to hold the _hash of the equal objects.
class VehicleDiff:
    ref1 = {}
    _intersection = []
    def __init__(self, set, color, brand):
        self.color = color
        self.brand = brand
Save the references to the objects of set 1 in dict ref1.
Check only the objects of set 2 against dict ref1, and save only if equal.
        _hash = hash((color, brand))
        if set == 1:
            VehicleDiff.ref1[_hash] = self
        elif _hash in VehicleDiff.ref1:
            VehicleDiff._intersection.append(_hash)
The helper method intersection gets the VehicleDiff objects from their _hash.
    @staticmethod
    def intersection():
        print('intersection:{}'.format(VehicleDiff._intersection))
        for _hash in VehicleDiff._intersection:
            yield VehicleDiff.ref1[_hash]
The string representation of a VehicleDiff object.
    def __str__(self): return 'color:{}, brand:{}'.format(self.color, self.brand)
There is no need to hold the objects in a list.
Note: since the given example data has no intersection, I added ('red', 'fiat') to set 2.
for p in [('blue', 'fiat'), ('red', 'volvo'), ('red', 'fiat')]: VehicleDiff(1, *p)
for p in [('blue', 'volvo'), ('red', 'BMW'), ('red', 'fiat')]: VehicleDiff(2, *p)
Print the result, if any.
Tested with Python: 3.4.2
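The same idea can be sketched without class-level state, keying the dict on the (color, brand) tuple itself instead of on its hash. Two different tuples can in principle share a hash value, whereas a dict keyed on the tuple also compares the values. The namedtuple stand-in and the function name `full_matches` are assumptions made for this illustration.

```python
from collections import namedtuple

# Stand-in for the vehicle class, for illustration only.
Vehicle = namedtuple("Vehicle", ["color", "brand"])


def full_matches(list1, list2):
    # Index the first list by (color, brand); lookups are O(1) on average.
    ref1 = {(v.color, v.brand): v for v in list1}
    keys2 = ((v.color, v.brand) for v in list2)
    return [ref1[key] for key in keys2 if key in ref1]


set1 = [Vehicle('blue', 'fiat'), Vehicle('red', 'volvo'), Vehicle('red', 'fiat')]
# ('red', 'fiat') is added to the second list so that there is something to find.
set2 = [Vehicle('blue', 'volvo'), Vehicle('red', 'BMW'), Vehicle('red', 'fiat')]

for v in full_matches(set1, set2):
    print('color:{}, brand:{}'.format(v.color, v.brand))  # color:red, brand:fiat
```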