Question

I have a dictionary with duplicate values.

Deca_dict = {
    "1": "2_506",
    "2": "2_506",
    "3": "2_506",
    "4": "2_600",
    "5": "2_600",
    "6": "1_650"
}

I used collections.Counter to count how many times each value occurs.

decaAdd_occurrences = {'2_506':3, '2_600':2, '1_650':1}
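For reference, a minimal sketch of that counting step, assuming the Deca_dict shown above:

```python
from collections import Counter

Deca_dict = {
    "1": "2_506", "2": "2_506", "3": "2_506",
    "4": "2_600", "5": "2_600", "6": "1_650",
}

# Count how many keys share each value.
decaAdd_occurrences = Counter(Deca_dict.values())
print(dict(decaAdd_occurrences))
```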

Then I created a new dictionary of the values that need to be updated.

deca_double_dict = {key: value for key, value in Deca_dict.items()
                        if decaAdd_occurrences[value] > 1}
deca_double_dict = {
    "1": "2_506",
    "3": "2_506",
    "2": "2_506",
    "5": "2_600",
    "4": "2_600"
}

(In this case it is the original dictionary without the last item.)

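I'm trying to figure out how to increment num as many times as the count in the counter dict minus 1. This would update all but one of the duplicates, which can stay the same. The goal output lets one of the duplicates keep its value, while the rest have the first number of the value string incremented progressively (based on the number of duplicates counted). I'm trying to achieve unique values for the data represented by the original Deca_dict.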

Goal output = {'1':'3_506', '2':'4_506', '3':'2_506', '4':'3_600', '5':'2_600'}

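I started working on it as follows, but just ended up incrementing all of the duplicate items, which left me with what I originally had, except with every value plus one. For context: the values in the original Deca_dict were built by concatenating two numbers (deca_address_num and deca_num_route). Also, homesLayer is a QGIS vector layer in which deca_address_num and deca_num_route are stored in the fields with indexes d_address_idx and id_route_idx.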

for key in deca_double_dict.keys():
    for home in homesLayer.getFeatures():
        if home.id() == key:
            deca_address_num = home.attributes()[d_address_idx]
            deca_num_route = home.attributes()[id_route_idx]
            deca_address_plus = deca_address_num + increment
            next_deca_address = (str(deca_address_plus) + '_' +
                                 str(deca_num_route))
            if not next_deca_address in Deca_dict.values():
                update_deca_dbl_dict[key] = next_deca_address

The result is useless:

Update_deca_dbl_dict = {
    "1": "3_506",
    "3": "3_506",
    "2": "3_506",
    "5": "3_600",
    "4": "3_600"
}

My second attempt tried to include a counter, but things are in the wrong place.

for key, value in deca_double_dict.iteritems():
    iterations = decaAdd_occurrences[value] - 1
    for home in homesLayer.getFeatures():
        if home.id() == key:
            #deca_homeID_list.append(home.id())
            increment = 1
            deca_address_num = home.attributes()[d_address_idx]
            deca_num_route = home.attributes()[id_route_idx]
            deca_address_plus = deca_address_num + increment
            next_deca_address = (str(deca_address_plus) + '_' +
                                 str(deca_num_route))
            #print deca_num_route
            while iterations > 0:
                if not next_deca_address in Deca_dict.values():
                    update_deca_dbl_dict[key] = next_deca_address
                    iterations -= 1
                    increment += 1

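Update: Even though one of the answers below works for incrementing all of the duplicate items in my dictionary, I'm trying to rework my own code, because I need the condition that compares each new value against the original data before incrementing. I still get the same (useless) result as in my first attempt.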

for key, value in deca_double_dict.iteritems():
    for home in homesLayer.getFeatures():
        if home.id() == key:
            iterations = decaAdd_occurrences[value] - 1
            increment = 1
            while iterations > 0:
                deca_address_num = home.attributes()[d_address_idx]
                deca_num_route = home.attributes()[id_route_idx]
                deca_address_plus = deca_address_num + increment
                current_address = str(deca_address_num) + '_' + str(deca_num_route)
                next_deca_address = (str(deca_address_plus) + '_' +
                                 str(deca_num_route))
                if not next_deca_address in Deca_dict.values():
                    update_deca_dbl_dict[key] = next_deca_address
                    iterations -= 1
                    increment += 1
                else:
                    alpha_deca_dbl_dict[key] = current_address
                    iterations = 0

Answer 1

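Is this roughly what you want? I'm assuming you can handle the function that changes 2_506 to 3_506, and so on. Instead of your counter, I use a set to make sure there are no duplicate values.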

In the original post I cut off a line at the bottom, sorry.

values_so_far = set()
d1 = {} # ---your original dictionary with duplicate values---
d2 = {} # d1 with all the duplicates changed
def increment_value(old_value):
    # you know how to write this
    # return the modified string
    raise NotImplementedError

for k,v in d1.items():
    while v in values_so_far:
        v = increment_value(v)
    d2[k] = v
    values_so_far.add(v)
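For completeness, here is one way the pieces might fit together. This is only a sketch: it assumes the values always have the form "num_route" as in the question, and the wrapper name make_unique is made up for illustration.

```python
def increment_value(old_value):
    """Bump the leading number of a 'num_route' string, e.g. '2_506' -> '3_506'."""
    num, route = old_value.split('_')
    return '{}_{}'.format(int(num) + 1, route)


def make_unique(d1):
    """Return a copy of d1 in which every value is unique (hypothetical helper)."""
    values_so_far = set()
    d2 = {}
    for k, v in d1.items():
        # Keep bumping the value until it no longer clashes with one already used.
        while v in values_so_far:
            v = increment_value(v)
        d2[k] = v
        values_so_far.add(v)
    return d2


d1 = {'1': '2_506', '2': '2_506', '3': '2_506',
      '4': '2_600', '5': '2_600', '6': '1_650'}
d2 = make_unique(d1)
print(d2)
```

Which key keeps the original value depends on the iteration order of d1; on Python 3.7+ dictionaries iterate in insertion order, so the first key holding each value is the one left unchanged.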

Answer 2

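Here is one solution: in essence, it keeps the first occurrence of each duplicated value and increments the leading number on the remaining duplicates.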

from collections import OrderedDict, defaultdict
orig_d = {'1':'2_506', '2':'2_506', '3':'2_506', '4':'2_600', '5':'2_600'}
orig_d = OrderedDict(sorted(orig_d.items(), key=lambda x: x[0]))

counter = defaultdict(int)
for k, v in orig_d.items():
    counter[v] += 1
    if counter[v] > 1:
        pre, post = v.split('_')
        pre = int(pre) + (counter[v] - 1)
        orig_d[k] = "%s_%s" % (pre, post)

print(orig_d)

Result:

OrderedDict([('1', '2_506'), ('2', '3_506'), ('3', '4_506'), ('4', '2_600'), ('5', '3_600')])
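One caveat: the loop above never checks whether an incremented value is already held by some other key (for example, if another key already had '3_506'). If that matters for your data, a variant along these lines might help. It is a sketch under the same "num_route" format assumption, and the function name dedupe_keep_first is hypothetical.

```python
def dedupe_keep_first(orig):
    """Keep the first holder of each value; bump the rest past any value in use."""
    taken = set(orig.values())   # every value present in the input is reserved
    seen = set()                 # values whose first holder was already visited
    result = {}
    for key in sorted(orig):
        value = orig[key]
        if value not in seen:
            seen.add(value)      # first holder keeps its value
            result[key] = value
            continue
        pre, post = value.split('_')
        num = int(pre) + 1
        candidate = '%d_%s' % (num, post)
        # Skip over anything that is already used somewhere in the data.
        while candidate in taken:
            num += 1
            candidate = '%d_%s' % (num, post)
        taken.add(candidate)
        result[key] = candidate
    return result


example = {'1': '2_506', '2': '2_506', '3': '3_506'}
print(dedupe_keep_first(example))
```

In this example key '2' cannot take '3_506' because key '3' already holds it, so it moves on to '4_506'.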

Answer 3

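I think this does what you need. I modified your input dictionary slightly to better illustrate what happens. The main difference from what you did is that decaAdd_occurrences, which is built from the Counter dictionary, tracks not only the count but also the current value of the address num prefix. That makes it possible to know which num value to use next, because both it and the count are updated as Deca_dict is modified.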

from collections import Counter

Deca_dict = {
    "1": "2_506",
    "2": "2_506",
    "3": "2_506",
    "4": "2_600",
    "5": "1_650",
    "6": "2_600"
}

decaAdd_occurrences = {k: (int(k.split('_')[0]), v) for k,v in
                                Counter(Deca_dict.values()).items()}

for key, value in Deca_dict.items():
    num, cnt = decaAdd_occurrences[value]
    if cnt > 1:
        route = value.split('_')[1]
        next_num = num + 1
        Deca_dict[key] = '{}_{}'.format(next_num, route)
        decaAdd_occurrences[value] = next_num, cnt-1  # update values

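Updated dictionary (which of the duplicates keeps its original value depends on the dictionary's iteration order):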

Deca_dict -> {
    "1": "3_506",
    "2": "2_506",
    "3": "4_506",
    "4": "3_600",
    "5": "1_650",
    "6": "2_600"
}
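To confirm that the values really are unique afterwards, a quick check such as the following could be used. It is only a sketch: the helper name duplicated_values is made up, and the two dictionaries are copied from the input and output shown above.

```python
from collections import Counter


def duplicated_values(d):
    """Return {value: count} for every value that occurs more than once in d."""
    return {v: n for v, n in Counter(d.values()).items() if n > 1}


original = {"1": "2_506", "2": "2_506", "3": "2_506",
            "4": "2_600", "5": "1_650", "6": "2_600"}
updated = {"1": "3_506", "2": "2_506", "3": "4_506",
           "4": "3_600", "5": "1_650", "6": "2_600"}

print(duplicated_values(original))  # values that still need fixing
print(duplicated_values(updated))   # empty when every value is unique
```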

Incrementing Python dictionary values based on a counter
