Question

Example:

 def test_get_word_len_dict():
    text = "May your coffee be strong and your Monday be short"
    the_dict = get_word_len_dict(text)
    print_dict_in_key_order(the_dict)
    print()

    text = 'why does someone believe you when you say there are four billion stars but they have to check when you say the paint is wet'
    the_dict = get_word_len_dict(text)
    print_dict_in_key_order(the_dict)

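I want a dictionary with integer keys whose values are lists of unique words. The list for a given key should contain every unique word from the text whose length equals that key, and each list should be sorted alphabetically. I don't know how to fix my function: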

def get_len_word_dict():
     new_text = text.split()
     for k,v in new_text.items():
        if len(v) = len(k):
           return k,v

Expected:

 2 : ['be']
 3 : ['May', 'and']
 4 : ['your']
 5 : ['short']
 6 : ['Monday', 'coffee', 'strong']

 2 : ['is', 'to']
 3 : ['are', 'but', 'say', 'the', 'wet', 'why', 'you']
 4 : ['does', 'four', 'have', 'they', 'when']
 5 : ['check', 'paint', 'stars', 'there']
 7 : ['believe', 'billion', 'someone']

Answer 1

A defaultdict of sets should do it. First, let's define a function.

from collections import defaultdict

def get_word_counts(text):
    d = defaultdict(set)

    for word in text.split():
        d[len(word)].add(word)    # observe this bit carefully

    return {k : list(v) for k, v in d.items()}

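The idea is to find the length of each word and insert the word into the list/set it belongs to. Once the function is defined, you can call it as you like.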

text = "May your coffee be strong and your Monday be short"
print(get_word_counts(text))
{2: ['be'], 3: ['and', 'May'], 4: ['your'], 5: ['short'], 6: ['coffee', 'strong', 'Monday']}
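
Note that a set has no defined order, so the lists printed above are not guaranteed to be alphabetical. A minimal sketch of a variant that matches the expected output from the question, assuming plain sorted() ordering is what is wanted (uppercase letters sort before lowercase ones, which is why 'May' comes before 'and'). The body of print_dict_in_key_order is never shown in the question, so the version here is a hypothetical stand-in:

```python
from collections import defaultdict


def get_word_len_dict(text):
    """Map each word length to an alphabetically sorted list of unique words."""
    groups = defaultdict(set)
    for word in text.split():
        groups[len(word)].add(word)  # the set drops duplicate words
    # sorted() turns each set into a list with a deterministic order
    return {length: sorted(words) for length, words in groups.items()}


def print_dict_in_key_order(a_dict):
    # Hypothetical helper: the question calls it but does not show its body.
    for key in sorted(a_dict):
        print(key, ":", a_dict[key])


text = "May your coffee be strong and your Monday be short"
result = get_word_len_dict(text)
print_dict_in_key_order(result)
```

Sorting happens once per key at the end, so the loop itself stays a simple set insertion.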

Answer 2

You can also use itertools.groupby together with sorted to get a similar, "lazy" result:

from itertools import groupby

a = 'a long list of words in nice'
x = groupby(sorted(a.split(), key=len), len)  # group the words by length
print(dict((a, list(b)) for a, b in x))
{1: ['a'], 2: ['of', 'in'], 4: ['long', 'list', 'nice'], 5: ['words']}

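By "lazy" I mean that the grouping is not actually computed (which matters if, for example, you have a very large string) until you start iterating over it; the sorted() call itself still runs immediately. Be careful with the iterators returned by groupby()! It is easy to exhaust them by accident and then try to read them a second time (and get empty lists).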

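The returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list.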
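
To make that pitfall concrete, here is a small sketch using the same sample sentence as above. The variable names are illustrative only. It shows a group being read twice, a group going stale once groupby() is advanced, and the safe pattern of copying each group into a list while it is still current:

```python
from itertools import groupby

words = sorted("a long list of words in nice".split(), key=len)

# 1) A group can only be read once.
groups = groupby(words, key=len)
length, group = next(groups)
first_read = list(group)
second_read = list(group)  # the group iterator is already exhausted
print(first_read, second_read)

# 2) Advancing groupby() makes the previous group unavailable.
groups = groupby(words, key=len)
_, stale_group = next(groups)
next(groups)  # move on to the next key before reading the first group
stale_read = list(stale_group)
print(stale_read)

# 3) Safe pattern: copy each group into a list while it is current.
saved = {length: list(group) for length, group in groupby(words, key=len)}
print(saved)
```

Only the third form keeps the words around for later use, which is why the one-liner above wraps each group in list() straight away.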

Getting word lengths in a dict in Python
