I have a dictionary with duplicate values.
Deca_dict = {
"1": "2_506",
"2": "2_506",
"3": "2_506",
"4": "2_600",
"5": "2_600",
"6": "1_650"
}
I used collections.Counter to count how many times each value occurs.
decaAdd_occurrences = {'2_506':3, '2_600':2, '1_650':1}
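For reference, a minimal sketch of how counts like these can be produced with collections.Counter, assuming the Deca_dict shown above:

```python
from collections import Counter

Deca_dict = {
    "1": "2_506", "2": "2_506", "3": "2_506",
    "4": "2_600", "5": "2_600", "6": "1_650",
}

# Count how many keys share each value.
decaAdd_occurrences = Counter(Deca_dict.values())
print(dict(decaAdd_occurrences))
```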
Then I created a new dictionary containing only the values that need to be updated.
deca_double_dict = {key: value for key, value in Deca_dict.items()
                    if decaAdd_occurrences[value] > 1}
deca_double_dict = {
"1": "2_506",
"3": "2_506",
"2": "2_506",
"4": "2_600",
"5": "2_600"
}
(In this case, it is the original dictionary without the last item.)
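I am trying to figure out how to increment num once for each duplicate except one, that is, the count in counter_dict minus 1 times. This would update all of the duplicates but one, which can stay the same. The goal output lets one of the duplicates keep its value, while the rest have the first number of the value string increased progressively (based on the number of duplicates counted). I am trying to get unique values for the data represented by the original Deca_dict.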
Goal output = {'1':'3_506', '2':'4_506', '3':'2_506', '4':'3_600', '5':'2_600'}
I started out as follows, but ended up just incrementing every duplicate item, which gave me what I had originally, only with each value increased by one. For context: the values of the original Deca_dict are built by concatenating two numbers (deca_address_num and deca_num_route). Also, homesLayer is a QGIS vector layer in which deca_address_num and deca_num_route are stored in the fields with indices d_address_idx and id_route_idx.
for key in deca_double_dict.keys():
    for home in homesLayer.getFeatures():
        if home.id() == key:
            deca_address_num = home.attributes()[d_address_idx]
            deca_num_route = home.attributes()[id_route_idx]
            deca_address_plus = deca_address_num + increment
            next_deca_address = (str(deca_address_plus) + '_' +
                                 str(deca_num_route))
            if not next_deca_address in Deca_dict.values():
                update_deca_dbl_dict[key] = next_deca_address
The result is useless:
Update_deca_dbl_dict = {
"1": "3_506",
"3": "3_506",
"2": "3_506",
"5": "3_600",
"4": "3_600"
}
My second attempt tried to include a counter, but things are in the wrong place.
for key, value in deca_double_dict.iteritems():
    iterations = decaAdd_occurrences[value] - 1
    for home in homesLayer.getFeatures():
        if home.id() == key:
            #deca_homeID_list.append(home.id())
            increment = 1
            deca_address_num = home.attributes()[d_address_idx]
            deca_num_route = home.attributes()[id_route_idx]
            deca_address_plus = deca_address_num + increment
            next_deca_address = (str(deca_address_plus) + '_' +
                                 str(deca_num_route))
            #print deca_num_route
            while iterations > 0:
                if not next_deca_address in Deca_dict.values():
                    update_deca_dbl_dict[key] = next_deca_address
                    iterations -= 1
                    increment += 1
UPDATE: Even though one of the answers below works for incrementing all the duplicate items in my dictionary, I am trying to rework my own code, because I need this comparison against the original data as the condition for incrementing. I still get the same (useless) result as in my first attempt.
for key, value in deca_double_dict.iteritems():
    for home in homesLayer.getFeatures():
        if home.id() == key:
            iterations = decaAdd_occurrences[value] - 1
            increment = 1
            while iterations > 0:
                deca_address_num = home.attributes()[d_address_idx]
                deca_num_route = home.attributes()[id_route_idx]
                deca_address_plus = deca_address_num + increment
                current_address = str(deca_address_num) + '_' + str(deca_num_route)
                next_deca_address = (str(deca_address_plus) + '_' +
                                     str(deca_num_route))
                if not next_deca_address in Deca_dict.values():
                    update_deca_dbl_dict[key] = next_deca_address
                    iterations -= 1
                    increment += 1
                else:
                    alpha_deca_dbl_dict[key] = current_address
                    iterations = 0
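Answer 0 (score: 1)
Is this roughly what you want? I am assuming you can handle the function that changes 2_506 to 3_506 and so on. Instead of your counter, I use a set to make sure there are no duplicate values.
In the original post I cut off a line at the bottom, sorry.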
values_so_far = set()
d1 = {} # ---your original dictionary with duplicate values---
d2 = {} # d1 with all the duplicates changed
def increment_value(old_value):
    # you know how to write this
    # return the modified string
    raise NotImplementedError  # placeholder so the function has a body
for k,v in d1.items():
    # keep changing the value until it has not been used before
    while v in values_so_far:
        v = increment_value(v)
    d2[k] = v
    values_so_far.add(v)
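A minimal sketch of how the pieces above might fit together once increment_value is filled in. The body of increment_value here is only one possible reading of "change 2_506 to 3_506" (raise the number before the underscore by one), and the wrapper name make_unique is hypothetical, not part of the answer:

```python
def increment_value(old_value):
    """Assumed behavior: raise the numeric prefix by one ('2_506' -> '3_506')."""
    prefix, route = old_value.split('_')
    return '{}_{}'.format(int(prefix) + 1, route)


def make_unique(d1):
    """Return a copy of d1 in which every value is unique (hypothetical helper)."""
    values_so_far = set()
    d2 = {}
    for k, v in d1.items():
        # Keep bumping the prefix until the value has not been used yet.
        while v in values_so_far:
            v = increment_value(v)
        d2[k] = v
        values_so_far.add(v)
    return d2


d1 = {'1': '2_506', '2': '2_506', '3': '2_506',
      '4': '2_600', '5': '2_600', '6': '1_650'}
d2 = make_unique(d1)
print(d2)
```

Because each candidate is checked against the set of values already assigned, an incremented value that is already taken simply gets bumped again.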
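Answer 1 (score: 1)
Here is a solution. In essence, it keeps the first occurrence of each duplicated value and increments the leading number on the remaining duplicates.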
from collections import OrderedDict, defaultdict
orig_d = {'1':'2_506', '2':'2_506', '3':'2_506', '4':'2_600', '5':'2_600'}
orig_d = OrderedDict(sorted(orig_d.items(), key=lambda x: x[0]))
counter = defaultdict(int)
for k, v in orig_d.items():
    counter[v] += 1
    if counter[v] > 1:
        pre, post = v.split('_')
        pre = int(pre) + (counter[v] - 1)
        orig_d[k] = "%s_%s" % (pre, post)
print(orig_d)
Result:
OrderedDict([('1', '2_506'), ('2', '3_506'), ('3', '4_506'), ('4', '2_600'), ('5', '3_600')])
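Answer 2 (score: 1)
I think this does what you need. I modified your input dictionary slightly to better illustrate what is happening. The main difference from what you did is that decaAdd_occurrences, which is created from the Counter dictionary, tracks not only the count but also the current value of the address num prefix. This makes it possible to know which num value to use next, because both it and the count are updated while Deca_dict is being modified.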
from collections import Counter
Deca_dict = {
    "1": "2_506",
    "2": "2_506",
    "3": "2_506",
    "4": "2_600",
    "5": "1_650",
    "6": "2_600"
}
decaAdd_occurrences = {k: (int(k.split('_')[0]), v) for k,v in
                       Counter(Deca_dict.values()).items()}
for key, value in Deca_dict.items():
    num, cnt = decaAdd_occurrences[value]
    if cnt > 1:
        route = value.split('_')[1]
        next_num = num + 1
        Deca_dict[key] = '{}_{}'.format(next_num, route)
        decaAdd_occurrences[value] = next_num, cnt-1 # update values
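Updated dictionary (which duplicate keeps its original value depends on the order in which the dictionary is iterated):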
Deca_dict -> {
"1": "3_506",
"2": "2_506",
"3": "4_506",
"4": "3_600",
"5": "1_650",
"6": "2_600"
}
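As a rough check, the same approach can be wrapped in a function and run against the dictionary from the question. This is only a sketch: the names spread_duplicates, tracker and question_dict are hypothetical, and it assumes Python 3.7+, where a dict is iterated in insertion order, so the last key visited for each duplicated value is the one that keeps it.

```python
from collections import Counter


def spread_duplicates(deca_dict):
    """Return a copy in which duplicated values get increasing prefixes."""
    result = dict(deca_dict)
    # Map each value to (current prefix, number of keys still sharing it).
    tracker = {v: (int(v.split('_')[0]), n)
               for v, n in Counter(deca_dict.values()).items()}
    for key, value in deca_dict.items():
        num, cnt = tracker[value]
        if cnt > 1:
            route = value.split('_')[1]
            next_num = num + 1
            result[key] = '{}_{}'.format(next_num, route)
            tracker[value] = (next_num, cnt - 1)  # one fewer duplicate left
    return result


question_dict = {"1": "2_506", "2": "2_506", "3": "2_506",
                 "4": "2_600", "5": "2_600", "6": "1_650"}
updated = spread_duplicates(question_dict)
print(updated)
```

Under that ordering assumption, keys "1" through "5" come out with the values listed in the question's goal output, and "6" is left unchanged. Like the loop above, this sketch does not check whether an incremented value collides with a value that already exists elsewhere in the dictionary.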