Question

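In the directory images, the image files are named like this: 1_foo.png, 2_foo.png, 14_foo.png, and so on.

The images are run through OCR, and the extracted text is stored in a dict using the following code: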

data_dict = {}

for i in os.listdir(images):
    if str(i[1]) != '_':
        k = str(i[:2])  # Get first two characters of image name and use as 'key'
    else:
        k = str(i[:1])  # Get first character of image name and use as 'key'
    # Initializes a list for each key so that multiple entries can be stored
    data_dict.setdefault(k, [])
    data_dict[k].append(pytesseract.image_to_string(i))

The code works as expected.
The number in an image name can range from 1 to 99.
Can this be reduced to a dictionary comprehension?

Answer 1

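No. Each iteration of a dict comprehension assigns a value to a key; it cannot update an existing list of values. A dict comprehension is not always better, and the code you wrote seems good enough. That said, perhaps you could write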

data_dict = {}

for i in os.listdir(images):
    k = i.partition("_")[0]
    image_string = pytesseract.image_to_string(i)
    data_dict.setdefault(k, []).append(image_string)
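
To illustrate the point about overwriting, here is a minimal sketch. The file names are made up, and the file name itself is stored in place of the OCR text so that it runs without pytesseract or any image files.

```python
# Made-up file names; two of them share the key "1".
names = ["1_foo.png", "2_foo.png", "1_baz.png"]

# Plain dict comprehension: each iteration assigns key -> value,
# so the later "1" entry replaces the earlier one.
overwritten = {name.partition("_")[0]: [name] for name in names}

# Loop with setdefault: entries that share a key accumulate in one list.
grouped = {}
for name in names:
    grouped.setdefault(name.partition("_")[0], []).append(name)

print(overwritten)  # {'1': ['1_baz.png'], '2': ['2_foo.png']}
print(grouped)      # {'1': ['1_foo.png', '1_baz.png'], '2': ['2_foo.png']}
```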

Answer 2

Yes. Here is one way, but I don't recommend it:

{k: d.setdefault(k, []).append(pytesseract.image_to_string(i)) or d[k]
 for d in [{}]
 for k, i in ((i.split('_')[0], i) for i in names)}

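That may be as clean as I can make it, but it is still awful. Better to use a normal loop, especially a clean one like Dennis's.

A slight variation (if I abuse it once, I might as well abuse it twice):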

{k: d.setdefault(k, []).append(pytesseract.image_to_string(i)) or d[k]
 for d in [{}]
 for i in names
 for k in i.split('_')[:1]}
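
For anyone puzzling over why this works, here is a runnable sketch of the same trick. It assumes a hypothetical fake_ocr function standing in for pytesseract.image_to_string and a made-up names list. The clause "for d in [{}]" creates one helper dict that lives for the whole comprehension, and because list.append returns None, the "or d[k]" part makes the value expression evaluate to the accumulated list.

```python
def fake_ocr(name):
    # Hypothetical stand-in for pytesseract.image_to_string,
    # so the sketch runs without any image files.
    return "text of " + name

# Made-up file names; two of them share the key "1".
names = ["1_foo.png", "2_foo.png", "1_baz.png"]

result = {
    # append() returns None, so "or d[k]" yields the list stored in d.
    k: d.setdefault(k, []).append(fake_ocr(i)) or d[k]
    for d in [{}]                   # one helper dict shared by all iterations
    for i in names
    for k in i.split("_")[:1]       # one-element loop that binds k to the prefix
}

print(result)
# {'1': ['text of 1_foo.png', 'text of 1_baz.png'], '2': ['text of 2_foo.png']}
```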

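Edit: kaya3 has now posted a good one using a dict comprehension. I would recommend that over mine, too. Mine are really just the dirty result of me going "Someone said this can't be done? Challenge accepted!".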

Answer 3

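itertools.groupby is useful in this situation; you can group the file names by their numeric part. Getting it to work is not entirely straightforward, though, because the members of each group must be contiguous in the sequence.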

This means that before using groupby, you need to sort with a key function that extracts the numeric part. That is the same key function we want to group by, so it makes sense to write the key function separately.

from itertools import groupby

def image_key(image):
    return str(image).partition('_')[0]

images = ['1_foo.png', '2_foo.png', '3_bar.png', '1_baz.png']

result = {
    k: list(v)
    for k, v in groupby(sorted(images, key=image_key), key=image_key)
}

# {'1': ['1_foo.png', '1_baz.png'],
#  '2': ['2_foo.png'],
#  '3': ['3_bar.png']}

For your use case, replace list(v) with a list of the OCR results for the files in v.
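
As a sketch of why the sort matters and what the adapted version might look like, the snippet below first runs groupby on unsorted input and then builds the final dict. The fake_ocr function is a hypothetical stand-in for pytesseract.image_to_string, and the file names are made up.

```python
from itertools import groupby

def image_key(image):
    # Everything before the first underscore, e.g. "14" for "14_foo.png".
    return str(image).partition("_")[0]

def fake_ocr(name):
    # Hypothetical stand-in for pytesseract.image_to_string.
    return "text of " + name

images = ["1_foo.png", "2_foo.png", "3_bar.png", "1_baz.png"]

# Without sorting, the two "1" files are not adjacent,
# so groupby starts a second group for the key "1".
unsorted_keys = [k for k, _ in groupby(images, key=image_key)]
print(unsorted_keys)  # ['1', '2', '3', '1']

# Sorting by the same key first makes each group contiguous.
result = {
    k: [fake_ocr(name) for name in v]
    for k, v in groupby(sorted(images, key=image_key), key=image_key)
}
print(result["1"])  # ['text of 1_foo.png', 'text of 1_baz.png']
```

Note that the keys are strings, so the sort order is lexicographic ('1', '14', '2'); that does not affect the grouping, only the order in which the keys end up in the dict.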
