Filtering a dictionary of lists in Python

Date: 2017-05-29 14:49:38

Tags: python dictionary

I have the following dictionary:

dict = {'Sex':['Male','Male','Female','Female','Male'],
        'Height': [100,200,150,80,90],
        'Weight': [20,60,40,30,30]}

I would like to filter this dictionary using a condition on one of its keys. For example, if I only want to keep the males:

new_dict = {'Sex':['Male','Male','Male'],
            'Height': [100,200,90],
            'Weight': [20,60,30]}

7 answers:

Answer 0 (score: 4)

You can use a dict comprehension and, while building each list of values, check the item at the corresponding index under the key 'Sex':

# dct is the dictionary of lists from the question
d = {k: [x for i, x in enumerate(v) if dct['Sex'][i] == 'Male']
     for k, v in dct.items()}
print(d)
# {'Sex': ['Male', 'Male', 'Male'], 
#  'Weight': [20, 60, 30], 
#  'Height': [100, 200, 90]}
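
The comprehension above looks up dct['Sex'][i] again for every element of every list. A minimal sketch of a reusable variant that computes the matching indices once and then reuses them for each key; the name filter_by_key and its predicate parameter are hypothetical and not part of the answer, and the lists are assumed to have equal length:

```python
def filter_by_key(dct, key, predicate):
    """Keep only the positions where predicate(dct[key][i]) is true.

    Hypothetical helper for illustration; assumes all lists have equal length.
    """
    # Indices of the rows that satisfy the condition on the chosen key
    keep = [i for i, value in enumerate(dct[key]) if predicate(value)]
    # Pick those same indices out of every list
    return {k: [v[i] for i in keep] for k, v in dct.items()}


dct = {'Sex': ['Male', 'Male', 'Female', 'Female', 'Male'],
       'Height': [100, 200, 150, 80, 90],
       'Weight': [20, 60, 40, 30, 30]}

males = filter_by_key(dct, 'Sex', lambda s: s == 'Male')
print(males)
```

Passing the condition as a function means the same helper can filter on any key, for example filter_by_key(dct, 'Weight', lambda w: w >= 40).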

Answer 1 (score: 3)

Rather than trying to keep track of indices, "transpose" the data structure into a list of dictionaries:

data = [{'Sex': 'Male', 'Height': 100, 'Weight': 20},
        {'Sex': 'Male', 'Height': 200, 'Weight': 60},
        {'Sex': 'Female', 'Height': 150, 'Weight': 40},
        {'Sex': 'Female', 'Height': 80, 'Weight': 30},
        {'Sex': 'Male', 'Height': 90, 'Weight': 30}]

only_males = [person for person in data if person['Sex'] == 'Male']
only_males
# [{'Sex': 'Male', 'Height': 100, 'Weight': 20},
#  {'Sex': 'Male', 'Height': 200, 'Weight': 60},
#  {'Sex': 'Male', 'Height': 90, 'Weight': 30}]
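
If the data already exists in the question's dictionary-of-lists form, the transposition does not need to be typed by hand. A sketch using zip(), assuming all lists have the same length (zip() silently stops at the shortest one otherwise); the variable names are illustrative:

```python
dct = {'Sex': ['Male', 'Male', 'Female', 'Female', 'Male'],
       'Height': [100, 200, 150, 80, 90],
       'Weight': [20, 60, 40, 30, 30]}

# zip(*dct.values()) yields one tuple per person;
# zip(dct, row) pairs each value in that tuple with its key.
data = [dict(zip(dct, row)) for row in zip(*dct.values())]

only_males = [person for person in data if person['Sex'] == 'Male']
print(only_males)
```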

Answer 2 (score: 1)

A solution using collections.defaultdict and the zip() function:

import collections

d = {
    'Sex':['Male','Male','Female','Female','Male'],
    'Height': [100,200,150,80,90],
    'Weight': [20,60,40,30,30]
}

result = collections.defaultdict(list)
for s,h,w in zip(d['Sex'], d['Height'], d['Weight']):
    if s == 'Male':
        result['Sex'].append(s)
        result['Height'].append(h)
        result['Weight'].append(w)

print(dict(result))

Output:

{'Sex': ['Male', 'Male', 'Male'], 'Weight': [20, 60, 30], 'Height': [100, 200, 90]}
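
The loop above names every key explicitly. A sketch of the same defaultdict and zip() idea that works for any set of keys, assuming equally long lists. Note that a defaultdict only gains a key once something is appended to it, so the result is an empty dictionary when no row matches:

```python
import collections

d = {
    'Sex': ['Male', 'Male', 'Female', 'Female', 'Male'],
    'Height': [100, 200, 150, 80, 90],
    'Weight': [20, 60, 40, 30, 30]
}

keys = list(d)
result = collections.defaultdict(list)
for row in zip(*d.values()):
    record = dict(zip(keys, row))      # one person as a small dict
    if record['Sex'] == 'Male':
        for key, value in record.items():
            result[key].append(value)

result = dict(result)
print(result)
```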

Answer 3 (score: 0)

You can use itertools.compress and a dict comprehension:

>>> import itertools

>>> dct = {'Sex':    ['Male', 'Male', 'Female', 'Female', 'Male'],
...        'Height': [100, 200, 150, 80, 90],
...        'Weight': [20, 60, 40, 30, 30]}

>>> mask = [item == 'Male' for item in dct['Sex']]

>>> new_dict = {key: list(itertools.compress(dct[key], mask)) for key in dct}
>>> new_dict
{'Height': [100, 200, 90],
 'Sex': ['Male', 'Male', 'Male'],
 'Weight': [20, 60, 30]}
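
Because the mask is an ordinary list of booleans, several conditions can be combined before calling itertools.compress. A sketch that keeps only males with a height of at least 100; the threshold is an arbitrary value chosen for illustration:

```python
import itertools

dct = {'Sex':    ['Male', 'Male', 'Female', 'Female', 'Male'],
       'Height': [100, 200, 150, 80, 90],
       'Weight': [20, 60, 40, 30, 30]}

is_male = [sex == 'Male' for sex in dct['Sex']]
is_tall = [height >= 100 for height in dct['Height']]

# A row is kept only if both conditions hold at that position
mask = [m and t for m, t in zip(is_male, is_tall)]

filtered = {key: list(itertools.compress(values, mask))
            for key, values in dct.items()}
print(filtered)
# {'Sex': ['Male', 'Male'], 'Height': [100, 200], 'Weight': [20, 60]}
```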

Answer 4 (score: 0)

You can use a pandas DataFrame (install the package first):

>>> import pandas

>>> data = pandas.DataFrame(
...     {'Sex': ['Male', 'Male', 'Female', 'Female', 'Male'],
...      'Height': [100, 200, 150, 80, 90],
...      'Weight': [20, 60, 40, 30, 30]}
... )

>>> data[data['Sex'] == 'Male']
   Height   Sex  Weight
0     100  Male      20
1     200  Male      60
4      90  Male      30

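This works more like a database, and you can filter on further conditions with little effort.

Answer 5 (score: 0)

Personally, I would use a list of objects, so that related attributes are kept together in the same object: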

people = [{"Sex": "Male", "Height": 100, "Weight": 20}, {...}, ...]

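I would convert to a list this way (assuming the lists in the dictionary all have the same size):

# d is the dictionary of lists from the question
people = []
for i in range(len(d["Sex"])):
    people.append({k: v[i] for k, v in d.items()})

The loop uses d.items(), which is the Python 3.x form; on Python 2, d.iteritems() does the same without building an intermediate list.

You can then easily filter the list by key value; more details here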
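
A sketch of that filtering step, followed by a conversion back to the dictionary-of-lists form the question asked for. It assumes people has been fully built; the rows below are the question's data written out by hand:

```python
people = [
    {"Sex": "Male", "Height": 100, "Weight": 20},
    {"Sex": "Male", "Height": 200, "Weight": 60},
    {"Sex": "Female", "Height": 150, "Weight": 40},
    {"Sex": "Female", "Height": 80, "Weight": 30},
    {"Sex": "Male", "Height": 90, "Weight": 30},
]

# Filter on the value stored under one key
males = [p for p in people if p["Sex"] == "Male"]

# Rebuild one list per key; keys come from the first row of the
# unfiltered list so every key is present even if nothing matched.
as_lists = {key: [p[key] for p in males] for key in people[0]}
print(as_lists)
```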

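Answer 6 (score: 0)

For what it's worth, I'll leave this here. It creates an in-memory database from your dictionary, which you can then query as flexibly as you need to get the results you want.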

dict_ = {'Sex': ['Male', 'Male', 'Female', 'Female', 'Male'],
        'Height': [100, 200, 150, 80, 90],
        'Weight': [20, 60, 40, 30, 30]}

import sqlite3

conn = sqlite3.connect(':memory:')
curs = conn.cursor()
column_headers = [x for x in dict_]  # the keys are the headers
column_types = ('' for x in dict_)
header_creation = ', '.join([' '.join(x) for x in zip(column_headers, column_types)])
curs.execute("CREATE TABLE temp ({})".format(header_creation))
bindings = ','.join('?' * (header_creation.count(',') + 1))
result_insertion = "INSERT INTO temp ({}) VALUES ({})".format(', '.join(column_headers), bindings)
for i, item in enumerate(dict_[column_headers[0]]):
    values = [item]
    for j in column_headers[1:]:
        values.append(dict_[j][i])
    curs.execute(result_insertion, values)
conn.commit()

condition = 'weight >= 40'

out = curs.execute('SELECT * FROM temp{}'.format(' WHERE {}'.format(condition) if condition else ';')).fetchall()
dict_out = {}
for i, k in enumerate(column_headers):
    dict_out[k] = [x[i] for x in out]
print(dict_out)  # {'Sex': ['Male', 'Female'], 'Weight': [60, 40], 'Height': [200, 150]}
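
The condition above is pasted directly into the SQL text, which is fine for a hard-coded string but unsafe if the value comes from user input. A shorter sketch of the same idea with a bound parameter, using only the standard sqlite3 module; the table name people and the variable min_weight are illustrative choices, and the column names are still interpolated on the assumption that the dictionary keys are trusted:

```python
import sqlite3

dict_ = {'Sex': ['Male', 'Male', 'Female', 'Female', 'Male'],
         'Height': [100, 200, 150, 80, 90],
         'Weight': [20, 60, 40, 30, 30]}

columns = list(dict_)  # the keys are the column headers
conn = sqlite3.connect(':memory:')

# Column names cannot be bound as parameters, so they are formatted in;
# this assumes the keys are trusted identifiers.
conn.execute('CREATE TABLE people ({})'.format(', '.join(columns)))

# One "?" per column, and one tuple per row via zip(*values)
placeholders = ', '.join('?' * len(columns))
conn.executemany('INSERT INTO people VALUES ({})'.format(placeholders),
                 zip(*dict_.values()))

min_weight = 40
rows = conn.execute('SELECT * FROM people WHERE Weight >= ?',
                    (min_weight,)).fetchall()

dict_out = {col: [row[i] for row in rows] for i, col in enumerate(columns)}
print(dict_out)
# {'Sex': ['Male', 'Female'], 'Height': [200, 150], 'Weight': [60, 40]}
conn.close()
```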