Question

I have a Counter object like this:

Counter({'the': 10, 'to': 10, 'of': 5, 'independence': 5, 'puigdemont': 5, 'mr': 5, 'a': 4, 'spain': 4, 'for': 4})

I would like to reassign each element's value in increasing order, following the order of the existing values, for example:

Counter({'the': 0, 'to': 1, 'of': 2, 'independence': 3, 'puigdemont': 4, 'mr': 5, 'a': 6, 'spain': 7, 'for': 8})

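Is there a way to do this?

Thanks in advance.

Update

(My English is not very good, so feel free to skip my explanation and scroll down to the example below.) Sorry, it seems I did not make my question clear. The actual Counter object is much longer. It is built from a paragraph, and each word's value is the number of times the word occurs in that paragraph. I want to build a dictionary and use its values to replace the words in the paragraph. The values in the dictionary follow the frequency of the words in the paragraph; if two words occur the same number of times, they are ordered alphabetically.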

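Example:

string = "where there is smoke there is fire"

Occurrences of each word in the string: where = 1, there = 2, is = 2, smoke = 1, fire = 1. So I need this dictionary:

{"is": 0, "there": 1, "fire": 2, "smoke": 3, "where": 4}

The most frequent words are "is" and "there", but "i" comes before "t" alphabetically, so "is" gets 0 and "there" gets 1.

Is there a good way to do this?

Thanks a lot!!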

Answer 1

You need an OrderedDict:

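from collections import Counter, OrderedDict

# The keys keep the order in which they are written (guaranteed for dict literals since Python 3.7).
data_dict = OrderedDict({'the': 10, 'to': 10, 'of': 5, 'independence': 5, 'puigdemont': 5, 'mr': 5, 'a': 4, 'spain': 4, 'for': 4})
# Pair each key with its position: 0, 1, 2, ...
c1 = Counter(dict(zip(data_dict.keys(), range(len(data_dict)))))
print(c1)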

Output:

Counter({'for': 8, 'spain': 7, 'a': 6, 'mr': 5, 'puigdemont': 4, 'independence': 3, 'of': 2, 'to': 1, 'the': 0})


Answer 2

Visit each key and change its value:

from collections import Counter

a_dict = Counter({'the': 10, 'to': 10, 'of': 5, 'independence': 5, 'puigdemont': 5, 'mr': 5, 'a': 4, 'spain': 4, 'for': 4})

# Number the keys in iteration order (insertion order on Python 3.7+).
for n, key in enumerate(a_dict):
    a_dict[key] = n

>>> a_dict
Counter({'for': 8, 'spain': 7, 'a': 6, 'mr': 5, 'puigdemont': 4, 'independence': 3, 'of': 2, 'to': 1, 'the': 0})

If a sorted list of tuples works for you:

>>> sorted(a_dict.items(), key=lambda x: x[1])
[('the', 0), ('to', 1), ('of', 2), ('independence', 3), ('puigdemont', 4), ('mr', 5), ('a', 6), ('spain', 7), ('for', 8)]

Answer 3

As I understand from the comments, you don't need the Counter itself to be sorted, so:

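from collections import Counter

c = Counter({'the': 10, 'to': 10, 'of': 5, 'independence': 5, 'puigdemont': 5, 'mr': 5, 'a': 4, 'spain': 4, 'for': 4})

# most_common() yields (word, count) pairs from the highest count to the lowest.
for i, k in enumerate(c.most_common()):
    c[k[0]] = i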

Result:

Counter({'spain': 8, 'for': 7, 'a': 6, 'puigdemont': 5, 'independence': 4, 'mr': 3, 'of': 2, 'the': 1, 'to': 0})

Update

m = c.most_common()
res = {k[0]: i for i, k in enumerate(sorted(m, key=lambda x: (-x[1], x[0])))}

Result:

{'a': 6, 'spain': 8, 'of': 4, 'mr': 3, 'the': 0, 'for': 7, 'to': 1, 'independence': 2, 'puigdemont': 5}
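The same sort key can be checked against the example sentence from the question. This is only a minimal sketch: it assumes the paragraph can be split on whitespace (real text with punctuation or mixed case would need more cleaning), and the names sentence, counts, ranked and word_ids are illustrative.

```python
from collections import Counter

# Example sentence from the question; whitespace splitting is an assumption.
sentence = "where there is smoke there is fire"
counts = Counter(sentence.split())

# Sort by descending count first, then alphabetically by word.
ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# Number the words in that order.
word_ids = {word: i for i, (word, _) in enumerate(ranked)}

print(word_ids)
# {'is': 0, 'there': 1, 'fire': 2, 'smoke': 3, 'where': 4}
```

Negating the count lets a single ascending sort put frequent words first while still breaking ties alphabetically.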

Answer 4

Sort the words by frequency and then alphabetically, and build a dictionary that assigns each word a unique ID in that order:

from collections import Counter

c = Counter({'the': 10, 'to': 10, 'of': 5, 'independence': 5, 'puigdemont': 5, 'mr': 5, 'a': 4, 'spain': 4, 'for': 4})
res = {word: unique_id for unique_id, (_, word) in enumerate(
    sorted([(-freq, word) for word, freq in c.most_common()]))
}

print(res)

Output:

{'the': 0, 'to': 1, 'independence': 2, 'mr': 3, 'of': 4, 'puigdemont': 5, 'a': 6, 'for': 7, 'spain': 8}

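Note that the result is a dict, so it is not necessarily ordered. (In CPython 3.6 it will keep insertion order, but there that is an implementation detail that should not be relied upon; the language guarantees it only from Python 3.7 onwards.)

The innermost comprehension builds (-freq, word) tuples, which give the desired sort order. The outer comprehension discards the frequency (it unpacks each tuple and keeps only the word) and uses enumerate to generate the unique IDs.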
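To make the two stages easier to follow, the one-liner can be unrolled into separate steps. This is an illustrative sketch on a shortened Counter, not a different method; the intermediate names pairs and ordered are made up for the example.

```python
from collections import Counter

# Shortened input, chosen only to keep the intermediate values readable.
c = Counter({'the': 10, 'to': 10, 'of': 5, 'a': 4})

# Step 1: build (-freq, word) tuples; negating makes higher counts sort first.
pairs = [(-freq, word) for word, freq in c.items()]

# Step 2: tuples compare element by element, so equal counts fall back to the word.
ordered = sorted(pairs)

# Step 3: drop the count and number the words in sorted order.
res = {word: unique_id for unique_id, (_, word) in enumerate(ordered)}

print(ordered)  # [(-10, 'the'), (-10, 'to'), (-5, 'of'), (-4, 'a')]
print(res)      # {'the': 0, 'to': 1, 'of': 2, 'a': 3}
```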

Edit: if the output needs to be ordered, use this instead:

from collections import Counter, OrderedDict

c = Counter({'the': 10, 'to': 10, 'of': 5, 'independence': 5, 'puigdemont': 5, 'mr': 5, 'a': 4, 'spain': 4, 'for': 4})
res = OrderedDict((word, unique_id) for unique_id, (_, word) in enumerate(
    sorted([(-freq, word) for word, freq in c.most_common()]))
)

print(res)
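Since the stated goal is to replace the words of a paragraph with these IDs, the last step might look like the sketch below. The helper encode_paragraph is hypothetical and not part of any library, and it assumes the paragraph is already lowercased and free of punctuation so that splitting on whitespace is enough.

```python
from collections import Counter

def encode_paragraph(paragraph):
    """Hypothetical helper: rank the words, then replace each word by its rank."""
    words = paragraph.split()  # assumption: whitespace-separated tokens
    counts = Counter(words)
    # Most frequent first; ties are broken alphabetically.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    word_ids = {w: i for i, w in enumerate(ranked)}
    return [word_ids[w] for w in words], word_ids

encoded, word_ids = encode_paragraph("where there is smoke there is fire")
print(encoded)  # [4, 1, 0, 3, 1, 0, 2]
```

Returning the mapping together with the encoded list keeps the IDs reusable, for instance to decode the paragraph again later.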

How do I reassign the values of a dictionary based on its existing values?