Processing a list of dicts: readability and complexity

Time: 2017-09-25 20:58:19

Tags: python list-comprehension code-readability

I have a basic question about comprehensions. I have a list of dicts, some of whose values are lists. It looks like this:

listionary = [{'path': ['/tmp/folder/cat/number/letter', '/tmp/folder/hog/char/number/letter', '/tmp/folder/hog/number/letter', '/etc'], 
'mask': True, 
'name': 'dict-1'}, 
{'path': ['/tmp/folder/dog/number-2/letter-4', '/tmp/folder/hog-00/char/number-1/letter-5', '/tmp/folder/cow/number-2/letter-3'], 
'mask': True, 
'name': 'dict-2'}, 
{'path': ['/tmp/folder/dog_111/number/letter', '/tmp/folder/ant/char/number/letter', '/tmp/folder/hen/number/letter'], 
'mask': True, 
'name': 'dict-3'}]

What I need is every unique animal found in the list-type values. The animal always sits between /tmp/folder/ and the next /. Here is what I did:

import re
flat_list = [item for sublist in [i['path'] for i in listionary] for item in sublist]
animals = list(set([re.search('folder/([a-z]+)', elem).group(1) for elem in flat_list if 'tmp' in elem]))

It could also be compressed into a single line, but that is very complex and hard to read:

animals = list(set([re.search('folder/([a-z]+)', elem).group(1) for elem in [item for sublist in [i['path'] for i in listionary] for item in sublist] if 'tmp' in elem]))

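Is there a golden rule about how large a comprehension should get (something like the Zen of Python)? How can I make this better? Thanks in advance.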

4 answers:

Answer 0: (score: 1)

How can you make it better?

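  1. Have other people read it. ✓
  2. Use functions to encapsulate the more complex operations.
  3. Don't nest loops on the same line.
  4. Here is how I would break down the last two points...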

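    def get_animals(d):
        # Return the folder name that follows '/tmp/folder/' in each path of d.
        animals = []
        for item in d['path']:
            if item.startswith('/tmp/folder/'):
                # 12 == len('/tmp/folder/'); slice up to the next '/'
                animals.append(item[12:item.find('/', 12)])
        return animals

    animals = set()
    for d in listionary:  # the list of dicts from the question
        animals.update(get_animals(d))
    animals = list(animals)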

Answer 1: (score: 0)

You can try this:

listionary = [{'path': ['/tmp/folder/cat/number/letter', '/tmp/folder/hog/char/number/letter', '/tmp/folder/hog/number/letter', '/etc'], 
'mask': True, 
'name': 'dict-1'}, 
 {'path': ['/tmp/folder/dog/number-2/letter-4', '/tmp/folder/hog-00/char/number-1/letter-5', '/tmp/folder/cow/number-2/letter-3'], 
'mask': True, 
'name': 'dict-2'}, 
{'path': ['/tmp/folder/dog_111/number/letter', '/tmp/folder/ant/char/number/letter', '/tmp/folder/hen/number/letter'], 
'mask': True, 
'name': 'dict-3'}]
import re
from itertools import chain
animals = list(set(chain.from_iterable([[re.findall("/tmp/folder/(.*?)/", b)[0] for b in i["path"] if re.findall("/tmp/folder/(.*?)/", b)] for i in listionary])))

Output:

['hog', 'hog-00', 'cow', 'dog_111', 'dog', 'cat', 'ant', 'hen']
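
Note that this pattern captures everything up to the next slash, so names such as hog-00 and dog_111 are kept whole, whereas the question's [a-z]+ pattern stops at the first character that is not a lowercase letter. A minimal sketch of the difference, using two paths from the data above (the variable names letters_only and up_to_slash are illustrative and not part of the original answer):

```python
import re

paths = ['/tmp/folder/hog-00/char/number-1/letter-5',
         '/tmp/folder/dog_111/number/letter']

# Pattern from the question: lowercase letters only.
letters_only = [re.search('folder/([a-z]+)', p).group(1) for p in paths]

# Pattern from this answer: everything up to the next slash.
up_to_slash = [re.findall('/tmp/folder/(.*?)/', p)[0] for p in paths]

print(letters_only)  # ['hog', 'dog']
print(up_to_slash)   # ['hog-00', 'dog_111']
```

Which result is wanted depends on whether hog-00 counts as a different animal from hog; the question does not say.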

Answer 2: (score: 0)

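You can make it more readable by adding line breaks and indentation. I stopped at the item for sublist... part because I don't understand the logic of that code, but more line breaks could probably be added there.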

animals = list(
    set([
            re.search('folder/([a-z]+)', elem).group(1) for elem in [
                item for sublist in [i['path'] for i in listionary] for item in sublist
            ]
            if 'tmp' in elem
    ])
)

That said, I would consider something like this more readable:

def animal_name_from_path(path):
    return re.search('folder/([a-z]+)', path).group(1)

def is_animal_path(path):
    return '/tmp' in path

def deduplicate(L):
    return list(set(L))

path_list = []
for item in listionary:
    path_list.extend(item['path'])

animals = deduplicate([animal_name_from_path(path) for path in path_list if is_animal_path(path)])

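One rule of thumb applied here is that every concept should have a name. In the original code, item for sublist in [i['path'] for i in listionary] for item in sublist is hard to understand because it isn't clear what item and i are supposed to be. In this new block it is clearer that you are just flattening the list of paths. Once all the concepts are named, the code that extracts the animal name is easier to follow. I may have taken it to an extreme here; you can find your own happy medium, whatever you find most readable.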
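
As one possible middle ground between the one-liner and the fully named version, the flattening step alone can be given a name. This is only a sketch: the helper all_paths is hypothetical (it is not part of the answer above), and listionary is shortened to two dicts for brevity.

```python
import re
from itertools import chain

# Shortened sample data in the same shape as the question's listionary.
listionary = [
    {'path': ['/tmp/folder/cat/number/letter', '/etc'], 'name': 'dict-1'},
    {'path': ['/tmp/folder/hog/number/letter', '/tmp/folder/cat/x'], 'name': 'dict-2'},
]

def all_paths(dicts):
    """Flatten the 'path' lists of every dict into one iterator of paths."""
    return chain.from_iterable(d['path'] for d in dicts)

# Set comprehension removes duplicates; sorted() gives a stable list.
animals = sorted({
    re.search('folder/([a-z]+)', path).group(1)
    for path in all_paths(listionary)
    if '/tmp' in path
})
print(animals)  # ['cat', 'hog']
```

Here only the step that was hard to read gets a name, and the filter and the regular expression stay inline.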

Answer 3: (score: 0)

A shortened solution:

animals = set(re.search(r'/folder/([a-z]+)', p).group(1) for d in listionary for p in d['path'] if '/tmp' in p)
print(animals)

Output:

{'hog', 'cat', 'dog', 'cow', 'hen', 'ant'}
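
One caveat that applies to this and the other re.search-based versions: a path that contains /tmp but does not match the pattern makes re.search return None, and calling .group(1) on it raises AttributeError. The sample data never triggers this. A sketch of one way to guard against it, assuming Python 3.8 or later for the := operator; the path /tmp/other is invented for illustration:

```python
import re

# Shortened sample data; '/tmp/other' is a made-up path that has no animal.
listionary = [
    {'path': ['/tmp/folder/cat/number/letter', '/tmp/other', '/etc']},
    {'path': ['/tmp/folder/hog-00/char', '/tmp/folder/cat/x']},
]

pattern = re.compile(r'/tmp/folder/([a-z]+)')

animals = {
    m.group(1)
    for d in listionary
    for p in d['path']
    if (m := pattern.search(p))  # skip paths that do not match
}
print(sorted(animals))  # ['cat', 'hog']
```

Because the pattern itself contains /tmp/folder/, the separate '/tmp' in p check is no longer needed.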