Question

Dataset:

data = [{'name':'kelly', 'attack':5, 'defense':10, 'country':'Germany'}, 
        {'name':'louis', 'attack':21, 'defense': 12, 'country':'france'}, 
        {'name':'ann', 'attack':43, 'defense':9, 'country':'Germany'}]

header = ['name', 'attack', 'defense', 'country']

filter_options = {'attack':4, 'defense':7, 'country':'Germany'}

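I want to write a function that takes data and filter_options as its arguments, i.e. func(data, filter_options).

filter_options should filter by exact match for string values, and/or by a threshold for numeric values: a record passes when its value is greater than or equal to the value given for that key. So my answer should be: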

answer = [{'name':'kelly', 'attack':5, 'defense':10, 'country':'Germany'},
          {'name':'ann', 'attack':43, 'defense':9, 'country':'Germany'}]

My current code:

search_key_list = [key for key in filter_options.keys()]
header_index_list = [header.index(i) for i in search_key_list if i in header]

answer = []
for i in header_index_list:
    for d in data:
        if type(filter_options[header[i]]) == int or type(filter_options[header[i]]) == float:
            if d[header[i]]>filter_options[header[i]]:
                answer.append(d)
        elif type((filter_options[header[i]])) == str:
            if d[header[i]] == filter_options[header[i]]:
                answer.append(d)

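The code is wrong because it does not consider multiple criteria together. It looks at one criterion, checks which dictionaries satisfy it, appends them to the answer list, and then moves on to the next criterion.

How can I fix this? Or what other code could I use?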

Answer 1

You need to check all of the "filters" and append a dictionary only if every filter matches it:

data = [{'name':'kelly', 'attack':5, 'defense':10, 'country':'Germany'}, 
        {'name':'louis', 'attack':21, 'defense': 12, 'country':'france'}, 
        {'name':'ann', 'attack':43, 'defense':9, 'country':'Germany'}]

header = ['name', 'attack', 'defense', 'country']

filter_options = {'attack':4, 'defense':7, 'country':'Germany'}


def filter_data(data, filter_options):
    answer = []
    for data_dict in data:
        for attr, value in filter_options.items():  # or iteritems
            if isinstance(value, (int, float)):     # isinstance is better than "type(x) == int"!
                if data_dict[attr] < value:         # check if it's NOT a match
                    break                           # stop comparing that dictionary
            elif isinstance(value, str):
                if data_dict[attr] != value:
                    break
        # If there was no "break" during the loop the "else" of the loop will
        # be executed
        else:
            answer.append(data_dict)
    return answer


>>> filter_data(data, filter_options)
[{'attack': 5, 'country': 'Germany', 'defense': 10, 'name': 'kelly'},
 {'attack': 43, 'country': 'Germany', 'defense': 9, 'name': 'ann'}]

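The trick here is that the inner loop checks whether the value is smaller (for numbers) or not equal (for strings) and, if so, immediately stops comparing that dictionary. The dictionary is appended only when the loop was not ended by break.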
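
The else clause on a loop can be surprising at first, so here is a minimal standalone sketch of just that mechanism. The function name all_at_least is hypothetical and exists only for this illustration; it is not part of the code above.

```python
def all_at_least(values, minimum):
    """Return True if every value is >= minimum (illustrative helper)."""
    for v in values:
        if v < minimum:
            break           # a failing value was found, so the else block is skipped
    else:
        return True         # runs only when the loop finished without break
    return False            # reached only after a break


print(all_at_least([5, 10], 4))   # no break occurs, so the else block runs
print(all_at_least([5, 3], 4))    # 3 < 4 triggers break, so the else block is skipped
```

Note that an empty sequence never triggers break, so the else block runs in that case as well.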

An alternative that does not use the loop's else clause:

def is_match(single_data, filter_options):
    for attr, value in filter_options.items():
        if isinstance(value, (int, float)):
            if single_data[attr] < value:
                return False
        elif isinstance(value, str):
            if single_data[attr] != value:
                return False
    return True

def filter_data(data, filter_options):
    answer = []
    for data_dict in data:
        if is_match(data_dict, filter_options):
            answer.append(data_dict)
    return answer

filter_data(data, filter_options)
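
The same idea can probably be written more compactly with the built-in all() and a list comprehension. The following is only a sketch under the same assumptions as above (every filter key exists in each dictionary, numbers mean "greater than or equal", strings mean "equal"); the names passes, matches and filter_data_compact are hypothetical.

```python
data = [{'name': 'kelly', 'attack': 5, 'defense': 10, 'country': 'Germany'},
        {'name': 'louis', 'attack': 21, 'defense': 12, 'country': 'france'},
        {'name': 'ann', 'attack': 43, 'defense': 9, 'country': 'Germany'}]

filter_options = {'attack': 4, 'defense': 7, 'country': 'Germany'}


def passes(actual, wanted):
    """Check a single criterion against a single value."""
    if isinstance(wanted, (int, float)):
        return actual >= wanted      # numeric filter: threshold
    if isinstance(wanted, str):
        return actual == wanted      # string filter: exact match
    return True                      # other types are ignored, as in the loops above


def matches(single_data, filter_options):
    """True only if every criterion in filter_options is satisfied."""
    return all(passes(single_data[attr], value)
               for attr, value in filter_options.items())


def filter_data_compact(data, filter_options):
    return [d for d in data if matches(d, filter_options)]


result = filter_data_compact(data, filter_options)
print([d['name'] for d in result])
```

all() stops at the first criterion that fails, so it short-circuits in the same way as the break and return False versions.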

You could also use a generator function instead of appending manually (based on the first approach):

def filter_data(data, filter_options):
    for data_dict in data:
        for attr, value in filter_options.items():
            if isinstance(value, (int, float)):
                if data_dict[attr] < value:
                    break
            elif isinstance(value, str):
                if data_dict[attr] != value:
                    break
        else:
            yield data_dict   # no "return answer" here: a generator yields its results

However, this requires converting the result to a list afterwards:

>>> list(filter_data(data, filter_options))
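
Converting to a list is only needed when all matches are wanted at once. As a small sketch of the lazy alternative, the generator can also be consumed one item at a time with next(); the block repeats the generator version (without any return of a list) so that it runs on its own.

```python
data = [{'name': 'kelly', 'attack': 5, 'defense': 10, 'country': 'Germany'},
        {'name': 'louis', 'attack': 21, 'defense': 12, 'country': 'france'},
        {'name': 'ann', 'attack': 43, 'defense': 9, 'country': 'Germany'}]

filter_options = {'attack': 4, 'defense': 7, 'country': 'Germany'}


def filter_data(data, filter_options):
    for data_dict in data:
        for attr, value in filter_options.items():
            if isinstance(value, (int, float)):
                if data_dict[attr] < value:
                    break
            elif isinstance(value, str):
                if data_dict[attr] != value:
                    break
        else:
            yield data_dict


matches = filter_data(data, filter_options)
first = next(matches, None)      # None would be returned if nothing matched
print(first['name'])

remaining = list(matches)        # continues from where next() stopped
print([d['name'] for d in remaining])
```

Because the generator is lazy, no further dictionaries are examined after the first match until more items are requested.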
