Dataset:
data = [{'name':'kelly', 'attack':5, 'defense':10, 'country':'Germany'},
{'name':'louis', 'attack':21, 'defense': 12, 'country':'france'},
{'name':'ann', 'attack':43, 'defense':9, 'country':'Germany'}]
header = ['name', 'attack', 'defense', 'country']
filter_options = {'attack':4, 'defense':7, 'country':'Germany'}
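I want to write a function that takes data and filter_options as arguments, i.e. func(data, filter_options).
filter_options should filter by exact match for string values, and/or, for numeric values, keep only the records whose value is greater than or equal to the value given for that key. So my answer should be: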
answer = [{'name':'kelly', 'attack':5, 'defense':10, 'country':'Germany'},
{'name':'ann', 'attack':43, 'defense':9, 'country':'Germany'}]
My current code:
search_key_list = [key for key in filter_options.keys()]
header_index_list = [header.index(i) for i in search_key_list if i in header]
answer = []
for i in header_index_list:
    for d in data:
        if type(filter_options[header[i]]) == int or type(filter_options[header[i]]) == float:
            if d[header[i]] > filter_options[header[i]]:
                answer.append(d)
        elif type((filter_options[header[i]])) == str:
            if d[header[i]] == filter_options[header[i]]:
                answer.append(d)
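The code is wrong because it does not consider multiple criteria together. It looks at one criterion, checks which dictionaries satisfy it, appends them to the answer list, and then moves on to the next criterion.
How can I correct this? Or what other code could I use?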
Answer 0 (score: 0)
You need to check all the "filters" and append a dictionary only if every filter matches it:
data = [{'name':'kelly', 'attack':5, 'defense':10, 'country':'Germany'},
{'name':'louis', 'attack':21, 'defense': 12, 'country':'france'},
{'name':'ann', 'attack':43, 'defense':9, 'country':'Germany'}]
header = ['name', 'attack', 'defense', 'country']
filter_options = {'attack':4, 'defense':7, 'country':'Germany'}
def filter_data(data, filter_options):
    answer = []
    for data_dict in data:
        for attr, value in filter_options.items():  # or iteritems
            if isinstance(value, (int, float)):  # isinstance is better than "type(x) == int"!
                if data_dict[attr] < value:  # check if it's NOT a match
                    break  # stop comparing that dictionary
            elif isinstance(value, str):
                if data_dict[attr] != value:
                    break
        # If there was no "break" during the loop the "else" of the loop will
        # be executed
        else:
            answer.append(data_dict)
    return answer
>>> filter_data(data, filter_options)
[{'attack': 5, 'country': 'Germany', 'defense': 10, 'name': 'kelly'},
{'attack': 43, 'country': 'Germany', 'defense': 9, 'name': 'ann'}]
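The trick here is that it checks for a non-match (smaller than the filter value for numbers, not equal for strings) and immediately stops comparing that dictionary. The dictionary is appended only when the loop was not break-ed.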
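To make the for/else behaviour concrete, here is a minimal standalone sketch. The function name first_negative is made up for illustration and is not part of the question; it only shows that the else clause runs when the loop finishes without a break.

```python
def first_negative(numbers):
    """Return the first negative number, or None if there is none.

    Hypothetical helper, used only to illustrate for/else.
    """
    for n in numbers:
        if n < 0:
            found = n
            break  # leaving the loop via break skips the else clause
    else:
        # Runs only when the loop finished without hitting break
        # (this includes the case of an empty input).
        found = None
    return found


print(first_negative([3, -1, 4]))  # -1
print(first_negative([3, 1, 4]))   # None
```

In filter_data the same mechanism is used the other way round: a break means "this dictionary failed a filter", so the else branch is the place where a fully matching dictionary gets appended.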
Another way, without using the else clause on the loop:
def is_match(single_data, filter_options):
    for attr, value in filter_options.items():
        if isinstance(value, (int, float)):
            if single_data[attr] < value:
                return False
        elif isinstance(value, str):
            if single_data[attr] != value:
                return False
    return True

def filter_data(data, filter_options):
    answer = []
    for data_dict in data:
        if is_match(data_dict, filter_options):
            answer.append(data_dict)
    return answer

filter_data(data, filter_options)
You could also use a generator function instead of appending manually (based on the first approach):
def filter_data(data, filter_options):
    for data_dict in data:
        for attr, value in filter_options.items():
            if isinstance(value, (int, float)):
                if data_dict[attr] < value:
                    break
            elif isinstance(value, str):
                if data_dict[attr] != value:
                    break
        else:
            yield data_dict
However, this requires converting the result to a list afterwards:
>>> list(filter_data(data, filter_options))
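For comparison, the same "every filter must match" rule can probably be written more compactly with all() and a list comprehension. This is only a sketch under a few assumptions: the helper name matches is made up, every key in filter_options is assumed to exist in each record, and (unlike the versions above, which skip such values) any filter value that is not an int or float is compared for equality.

```python
data = [{'name': 'kelly', 'attack': 5, 'defense': 10, 'country': 'Germany'},
        {'name': 'louis', 'attack': 21, 'defense': 12, 'country': 'france'},
        {'name': 'ann', 'attack': 43, 'defense': 9, 'country': 'Germany'}]
filter_options = {'attack': 4, 'defense': 7, 'country': 'Germany'}


def matches(record, filter_options):
    """Return True only if the record satisfies every filter.

    Numeric filter values are treated as a minimum (>=); everything
    else must be equal. Note that bool is a subclass of int, so a
    True/False filter value would also be treated as numeric.
    """
    return all(
        record[attr] >= value if isinstance(value, (int, float))
        else record[attr] == value
        for attr, value in filter_options.items()
    )


def filter_data(data, filter_options):
    # Keep the records for which all filters hold.
    return [record for record in data if matches(record, filter_options)]


print([record['name'] for record in filter_data(data, filter_options)])
# ['kelly', 'ann']
```

all() stops at the first failing filter, so it short-circuits in the same way as the break in the loop-based versions.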