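How to flag missing dictionary keys

Time: 2019-06-26 18:47:43

Tags: python

I have some files containing CommonChar lines, and my Python code builds a dictionary from them. Some keys are required, and users may forget to include them. The code should flag both the file and the missing keys.

The syntax that the Python code expects looks like this: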

CommonChar pins Category General
CommonChar pins Contact Mark
CommonChar pins Description 1st line 
CommonChar pins Description 2nd line 
CommonChar nails Category specific
CommonChar nails Description 1st line

So in the example above, the "Contact" entry is missing for nails, e.g.: CommonChar nails Contact Robert

I have a list such as: mustNeededKeys = ["Category", "Description", "Contact"]

import os
import re
import sys

mainDict = {}
for dirName, subdirList, fileList in os.walk(sys.argv[1]):
    for eachFile in fileList:
        # exclude file names ending in .swp, .swo or ~, which are created
        # temporarily when editing in vim
        if not eachFile.endswith(('.swp', '.swo', '~')):
            #print eachFile
            filePath = os.path.join(dirName, eachFile)
            #print filePath
            with open(filePath, "r") as fh:
                contents = fh.read()
            # collect everything from "CommonChar" to the end of each line
            items = re.findall("CommonChar.*$", contents, re.MULTILINE)
            for x in items:
                # fields: marker, group, topic, then the rest of the line
                cc, group, topic, data = x.split(None, 3)
                data = data.split()
                group_dict = mainDict.setdefault(group, {'fileLocation': [filePath]})
                if topic in group_dict:
                    # repeated topic: append, separated by '</br>'
                    group_dict[topic].extend(['</br>'] + data)
                else:
                    group_dict[topic] = data

The code above builds a dictionary like this (fileLocation entries omitted):

{'pins': {'Category': ['General'], 'Contact': ['Mark'], 'Description': ['1st', 'line', '</br>', '2nd', 'line']}, 'nails': {'Category': ['specific'], 'Description': ['1st', 'line']}}

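So, while each file with CommonChar lines is read and group_dict is built, is there a way to compare its keys against mustNeededKeys, flag any that are missing, and carry on if they are all present?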

2 Answers:

Answer 0: (score: 1)

Something like this should work:

# Setup mainDict (equivalent to code given above)
mainDict = {
    'nails': {
        'Category': ['specific'],
        'Description': ['1st', 'line'],
        'fileLocation': ['/some/path/nails.txt']
    },
    'pins': {
        'Category': ['General'],
        'Contact': ['Mark'],
        'Description': ['1st', 'line', '</br>', '2nd', 'line'],
        'fileLocation': ['/some/path/pins.txt']
    }
}

# check for missing keys
mustNeededKeys = {"Category", "Description", "Contact"}
for group, group_dict in mainDict.items():
    missing_keys = mustNeededKeys - set(group_dict.keys())
    if missing_keys:
        missing_key_list = ','.join(missing_keys)
        print(
            'group "{}" ({}) is missing key(s): {}'
            .format(group, group_dict['fileLocation'][0], missing_key_list)
        )
# group "nails" (/some/path/nails.txt) is missing key(s): Contact
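If the result needs to be used later rather than printed, the same check can be wrapped in a function. This is only a sketch: the name `find_missing_keys` and the shape of the returned report are assumptions for illustration, not part of the original code.

```python
def find_missing_keys(main_dict, required_keys):
    """Return {group: sorted list of missing required keys}.

    Groups that have every required key are left out of the report.
    (Hypothetical helper; name and return shape are illustrative.)
    """
    required = set(required_keys)
    report = {}
    for group, group_dict in main_dict.items():
        # set difference: required keys that the group does not have
        missing = required - set(group_dict)
        if missing:
            report[group] = sorted(missing)
    return report


example = {
    'nails': {
        'Category': ['specific'],
        'Description': ['1st', 'line'],
        'fileLocation': ['/some/path/nails.txt'],
    },
    'pins': {
        'Category': ['General'],
        'Contact': ['Mark'],
        'Description': ['1st', 'line', '</br>', '2nd', 'line'],
        'fileLocation': ['/some/path/pins.txt'],
    },
}

report = find_missing_keys(example, ["Category", "Description", "Contact"])
print(report)
```

Sorting the missing keys keeps the report order stable, since iterating over a set has no guaranteed order.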

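If you need to check for missing keys immediately after each group is processed, you can use the code below. It assumes that each group is stored as a contiguous set of lines in a single file (i.e., not interleaved with other groups in the same file or spread across different files).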

import os
import re
import sys
from itertools import groupby

mainDict={}
mustNeededKeys = {"Category", "Description", "Contact"}
for dirName, subdirList, fileList in os.walk(sys.argv[1]):
    for eachFile in fileList:
        # excluding file names ending in .swp , swo which are created 
        # temporarily when editing in vim
        if not eachFile.endswith(('.swp','.swo','~')):
            #print eachFile
            filePath = os.path.join(dirName,eachFile)
            #print filePath
            with open(filePath, "r") as fh:
                contents = fh.read()
            items = re.findall("CommonChar.*$", contents, re.MULTILINE)
            split_items = [line.split(None, 3) for line in items]
            # group the items by group name (element 1 in each row)
            for g, group_items in groupby(split_items, lambda row: row[1]):
                group_dict = {'fileLocation': [filePath]}
                # store all items in the current group
                for cc, group, topic, data in group_items:
                    data = data.split()
                    if topic in group_dict:
                        group_dict[topic].extend(['</br>'] + data)
                    else:
                        group_dict[topic] = data
                # check for missing keys
                missing_keys = mustNeededKeys - set(group_dict.keys())
                if missing_keys:
                    missing_key_list = ','.join(missing_keys)
                    print(
                        'group "{}" ({}) is missing key(s): {}'
                        .format(group, filePath, missing_key_list)
                    )
                # add group to mainDict
                mainDict[group] = group_dict
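The contiguity assumption matters because itertools.groupby only merges consecutive items that share a key. The small sketch below uses made-up group names (the helper `runs` is hypothetical) to show how interleaved lines split one group into several runs:

```python
from itertools import groupby

# group names as they might appear line by line in a file (illustrative data)
contiguous = ["pins", "pins", "nails", "nails"]
interleaved = ["pins", "nails", "pins", "nails"]


def runs(names):
    """List (key, run length) for each consecutive run that groupby yields."""
    return [(key, len(list(rows))) for key, rows in groupby(names)]


contiguous_runs = runs(contiguous)
interleaved_runs = runs(interleaved)
print(contiguous_runs)
print(interleaved_runs)
```

With interleaved input, each run would be checked on its own and the last run would overwrite the earlier ones in mainDict, so keys could be reported as missing even though they appear elsewhere in the file.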

Answer 1: (score: 0)

data = '''CommonChar pins Category General
CommonChar pins Contact Mark
CommonChar pins Description 1st line
CommonChar pins Description 2nd line
CommonChar nails Category specific
CommonChar nails Description 1st line'''

from collections import defaultdict
from pprint import pprint

required_keys = ["Category", "Description", "Contact"]

d = defaultdict(dict)
for line in data.splitlines():
    line = line.split()
    if line[2] == 'Description':
        if line[2] not in d[line[1]]:
            d[line[1]][line[2]] = []
        d[line[1]][line[2]].extend(line[3:])
    else:
        d[line[1]][line[2]] = [line[3]]

pprint(dict(d))
print('*' * 80)

# find missing keys
for k in d.keys():
    # set difference: only required keys that are absent count as missing
    for missing_key in sorted(set(required_keys) - set(d[k].keys())):
        print('Key "{}" is missing "{}"!'.format(k, missing_key))

Prints:

{'nails': {'Category': ['specific'], 'Description': ['1st', 'line']},
 'pins': {'Category': ['General'],
          'Contact': ['Mark'],
          'Description': ['1st', 'line', '2nd', 'line']}}
********************************************************************************
Key "nails" is missing "Contact"!
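A note on the choice of set operation when looking for missing keys: a set difference (required minus present) reports only the required keys that are absent, while a symmetric difference (`^`) also reports keys that are present but not required. A minimal sketch, where the `present` set is a hypothetical group that carries an extra `fileLocation` key:

```python
required = {"Category", "Description", "Contact"}
# hypothetical keys of one group, including a non-required extra key
present = {"Category", "Description", "fileLocation"}

# keys that are required but absent
missing_only = required - present
# keys that are in exactly one of the two sets
symmetric = required ^ present

print(sorted(missing_only))
print(sorted(symmetric))
```

With the sample data in this answer both operations give the same result, because no group has keys outside the required list; they diverge as soon as a group stores anything extra.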