Question

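I have some tuples containing strings, and I want to remove the ones that contain 3 or more of the same element. So I need to check whether any tuple has 3 or more 'A', 'B', 'C' or 'D' in it. How can I do this? Thanks.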

('A', 'A', 'A', 'A') <--remove because it has more than 3 'A's in it
('A', 'A', 'A', 'B') <--remove because it has 3 'A's in it
('B', 'B', 'B', 'B') <--remove because it has more than 3 'B's in it
('B', 'B', 'B', 'C') <--remove because it has 3 'B's in it
('A', 'A', 'B', 'A') <--remove because it has 3 'A's in it
('A', 'A', 'B', 'B') <--this is ok
('A', 'A', 'B', 'C') <--this is ok
('A', 'A', 'B', 'D') <--this is ok

Answer 1

You can use collections.Counter to count how many times each element occurs:

from collections import Counter

data = [('A', 'A', 'A', 'A'),
        ('A', 'A', 'A', 'B'),
        ('B', 'B', 'B', 'B'),
        ('B', 'B', 'B', 'C'),
        ('A', 'A', 'B', 'A'),
        ('A', 'A', 'B', 'B'),
        ('A', 'A', 'B', 'C'),
        ('A', 'A', 'B', 'D')]


result = [t for t in data if all(value < 3 for value in Counter(t).values())]
print(result)

Output:

[('A', 'A', 'B', 'B'), ('A', 'A', 'B', 'C'), ('A', 'A', 'B', 'D')]

As @coldspeed notes (see Answer 3), you do not need to test every value; testing only the largest count with max is enough.
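
To make the check in this answer concrete, here is a small sketch of what Counter reports for a single tuple from the question and how the all(...) test reads it. The variable names are made up for illustration.

```python
from collections import Counter

# One of the tuples that should be removed: 'A' appears three times.
sample = ('A', 'A', 'B', 'A')

counts = Counter(sample)          # maps each element to its number of occurrences
print(counts['A'], counts['B'])   # 3 1

# Keep the tuple only if every element occurs fewer than 3 times.
keep = all(value < 3 for value in counts.values())
print(keep)                       # False
```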

Answer 2

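Edit: This solution involves extra computation and is less efficient than calling max on the values. Avoid it. See the comments for an excellent discussion.
You can use collections.Counter, and use the counter's most_common method to avoid checking every value in the Counter yourself. (Edit: however, most_common goes through a heap-based selection when it is passed an argument, which makes it more expensive. Thanks to the commenters for pointing this out.)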

from collections import Counter

data = [('A', 'A', 'A', 'A'),
        ('A', 'A', 'A', 'B'),
        ('B', 'B', 'B', 'B'),
        ('B', 'B', 'B', 'C'),
        ('A', 'A', 'B', 'A'),
        ('A', 'A', 'B', 'B'),
        ('A', 'A', 'B', 'C'),
        ('A', 'A', 'B', 'D')]


result = [t for t in data if Counter(t).most_common(1)[0][1] < 3]
print(result)
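
As a quick sanity check on the edit note above, the sketch below confirms that most_common(1) and max over the values pick out the same largest count, so both filters keep the same tuples. It uses a shortened copy of the data and illustrative variable names, and it says nothing about speed; comparing timings would need a benchmark, for example with timeit.

```python
from collections import Counter

data = [('A', 'A', 'A', 'A'),
        ('A', 'A', 'B', 'A'),
        ('A', 'A', 'B', 'B'),
        ('A', 'A', 'B', 'D')]

# Largest count via most_common(1): a list holding one (element, count) pair.
via_most_common = [t for t in data if Counter(t).most_common(1)[0][1] < 3]

# Largest count via max over the counts.
via_max = [t for t in data if max(Counter(t).values()) < 3]

print(via_most_common == via_max)  # True
```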

Answer 3

You do not need to test every value. You can test only the largest one.

result = [i for i in data if max(Counter(i).values()) < 3]

Output

[('A', 'A', 'B', 'B'), ('A', 'A', 'B', 'C'), ('A', 'A', 'B', 'D')]
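
One edge case the one-liner does not cover: max raises ValueError on an empty sequence, so an empty tuple in data would make it fail. Below is a minimal sketch of a reusable filter, assuming empty tuples should be kept. The function name drop_repeated and its limit parameter are hypothetical, not part of the answers.

```python
from collections import Counter

def drop_repeated(tuples, limit=3):
    """Return the tuples in which no element occurs `limit` or more times."""
    # default=0 makes max safe for empty tuples, which then pass the test.
    return [t for t in tuples
            if max(Counter(t).values(), default=0) < limit]

data = [('A', 'A', 'A', 'B'), ('A', 'A', 'B', 'C'), ()]
print(drop_repeated(data))  # [('A', 'A', 'B', 'C'), ()]
```

Passing a different limit changes the threshold, for example limit=2 drops any tuple with a repeated element.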

How to filter out lists of strings that do not meet a required condition?

3 answers: