Question

I have defined a custom object with several fields.

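For example, say I have a Student object consisting of a name, an ID, and an age. To compare two students and determine whether they are the same student, I implemented an __eq__ method that returns whether the age, name, and ID of the two students match.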

def __eq__(self, other):
   return self.name == other.name and self.ID == other.ID and self.age == other.age

Keep in mind that Student is only an example, so please disregard the fact that student IDs tend to be unique.

Suppose I have the following enrollment lists, each containing an arbitrary number of Student objects:

[S1, S2, S3]
[S2, S3]
[S3, S5, S4]
[S1, S4, S2, S1]

I want to create some data structure that contains the following elements:

S1, S2, S3, S4, S5

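The simplest way would be to initialize some data structure that can hold many items, take an item, check whether it already exists in the structure, and add it if it does not.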

new_list = []
for students in enrollment_lists:  # enrollment_lists: the lists shown above
    for student in students:
        # Linear scan: calls Student.__eq__ against each stored item
        if student not in new_list:
            new_list.append(student)

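If I implement this as a plain list, I could end up doing a large number of comparisons as the list keeps growing, especially if I have many students and many enrollment lists.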

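What is an efficient way to implement this, both for comparing two objects and for using that comparison to build a collection of unique objects?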

Edit: So I tried a simple set implementation.

>>> a = Student("sample", 1234, 18)
>>> b = Student("sample", 1234, 18)
>>> students = set()
>>> students.add(a)
>>> b in students
False
>>> b == a
True

Am I doing something wrong?

Answer 1

from itertools import chain
myset = set(chain(iterable1, iterable2, iterable3, iterable4))

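You get the unique items, and you iterate over each iterable only once. chain makes one long iterable out of a series of iterables. If you need the result sorted, sorted(myset) will give you a sorted list, provided the items define an ordering.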

Your Student class needs to implement a __hash__ that is compatible with its __eq__:

def __hash__(self):
    # Hash the same fields that __eq__ compares
    return hash((self.name, self.ID, self.age))
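Putting the pieces together, here is a minimal sketch of how this could look. The constructor signature, the `_key` helper, and the sample students are assumptions made for illustration; only the three fields and the `__eq__`/`__hash__` pairing come from the question and this answer.

```python
from itertools import chain


class Student:
    # Hypothetical constructor; the question only names the three fields.
    def __init__(self, name, ID, age):
        self.name = name
        self.ID = ID
        self.age = age

    def _key(self):
        # One tuple drives both __eq__ and __hash__, keeping them consistent.
        return (self.name, self.ID, self.age)

    def __eq__(self, other):
        if not isinstance(other, Student):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        return hash(self._key())


a = Student("sample", 1234, 18)
b = Student("sample", 1234, 18)
students = {a}
print(b in students)  # True

s1, s2, s3 = Student("Ann", 1, 20), Student("Bob", 2, 21), Student("Cy", 3, 22)
enrollments = [[s1, s2, s3], [s2, s3], [Student("Ann", 1, 20)]]
unique = set(chain.from_iterable(enrollments))
print(len(unique))  # 3
print([s.name for s in sorted(unique, key=lambda s: s.ID)])  # ['Ann', 'Bob', 'Cy']
```

Without a matching __hash__, Python 2 falls back to an identity-based hash, which is why `b in students` is False in the question. In Python 3, a class that defines only __eq__ becomes unhashable, so `students.add(a)` raises a TypeError instead. Since Student defines no ordering here, sorted needs an explicit key. Fields that feed the hash should not be changed while the object is in a set.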

Answer 2

A set is not guaranteed to preserve order. If you need an order-preserving list:

import itertools
from typing import List

def unique_items(*lists: List) -> List:
    """Return an order-preserving list of unique items from the given lists.

    The implemented approach requires that the input items are hashable.

    Example: unique_items([1,9,4], [2,4,6,8,8], [3,1]) -> [1, 9, 4, 2, 6, 8, 3]

    Ref: https://stackoverflow.com/a/68626841/
    """
    return list(dict.fromkeys(itertools.chain(*lists)))
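To show what the one-liner above is doing, here is a rough equivalent written as an explicit loop. The function name `unique_items_explicit` is hypothetical. dict.fromkeys gives the same result because dict keys are unique and dicts preserve insertion order (guaranteed since Python 3.7).

```python
from itertools import chain


def unique_items_explicit(*lists):
    """Order-preserving de-duplication using a set of already-seen items."""
    seen = set()
    result = []
    for item in chain.from_iterable(lists):
        if item not in seen:  # average O(1) lookup; items must be hashable
            seen.add(item)
            result.append(item)
    return result


print(unique_items_explicit([1, 9, 4], [2, 4, 6, 8, 8], [3, 1]))
# [1, 9, 4, 2, 6, 8, 3]
```

Both versions keep the first occurrence of each item and need hashable items, so for Student objects the class must define a __hash__ consistent with its __eq__.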

Answer 3

I have but one word for you.

set

Here are the docs for sets
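As a small sketch of what a set does with the lists from the question, the loop below feeds each list into one set. The strings are stand-ins for Student objects (an assumption for illustration); real Student instances would need a __hash__ that agrees with __eq__.

```python
# Stand-ins for the enrollment lists in the question.
enrollments = [
    ["S1", "S2", "S3"],
    ["S2", "S3"],
    ["S3", "S5", "S4"],
    ["S1", "S4", "S2", "S1"],
]

unique = set()
for students in enrollments:
    unique.update(students)  # adds only members not already present

print(sorted(unique))  # ['S1', 'S2', 'S3', 'S4', 'S5']
print(unique == set().union(*enrollments))  # True
```

Membership checks and insertions on a set take constant time on average, so this avoids the growing number of comparisons that the plain-list approach would need.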

Create a unique list of objects from multiple lists
