Map word lengths in a sentence to lists of words

Time: 2018-04-18 14:07:26

Tags: python string python-3.x list dictionary

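The instructions ask for code that returns the length of each word in a string, so it counts the letters in each word and prints the count next to the word. I have this code: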

def word_lengths(a):
    a = a.lower()
    c = list(a)
    a = ""
    for x in c:
        if x == "," or x == "." or x == "'" or x == "!" or x == "?":
            c[c.index(x)] = ""
    for x in c:
        a += x
    y = a.split()
    z = {}
    for x in y:
        z[x] = len(x)
    return z
print(word_lengths("I ate a bowl of cereal out of a dog bowl today."))

It returns:

{'dog': 3, 'bowl': 4, 'a': 1, 'out': 3, 'of': 2, 'ate': 3, 'cereal': 6, 'i': 1, 'today': 5}

5 answers:

Answer 0 (score: 3)

You can use collections.defaultdict for an O(n) solution:

from collections import defaultdict
from string import punctuation

def word_lengths(x):
    table = str.maketrans(punctuation, ' ' * len(punctuation))
    # alternatively, table = str.maketrans({key: None for key in punctuation})
    x = x.translate(table).lower()
    d = defaultdict(list)
    for word in x.split():
        d[len(word)].append(word)
    return d

res = word_lengths("I ate a bowl of cereal out of a dog bowl today.")

# defaultdict(list,
#             {1: ['i', 'a', 'a'],
#              2: ['of', 'of'],
#              3: ['ate', 'out', 'dog'],
#              4: ['bowl', 'bowl'],
#              5: ['today'],
#              6: ['cereal']})

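Explanation:

  • First strip the punctuation (as in @Patrick's solution) and convert the string to lowercase.
  • Initialize a defaultdict of lists.
  • Split the string on whitespace, iterate over the words, and append each word to the list stored under its length.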
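
The two translation tables mentioned in the code above treat punctuation inside a word differently. The following is a minimal sketch of that difference, not part of the original answer; the helper name `bucket_by_length` and the sample sentence are made up for illustration.

```python
from collections import defaultdict
from string import punctuation

# Table 1: replace each punctuation character with a space (as in the answer).
to_space = str.maketrans(punctuation, ' ' * len(punctuation))
# Table 2: delete each punctuation character (the alternative in the comment).
to_none = str.maketrans({key: None for key in punctuation})

def bucket_by_length(text, table):
    """Hypothetical helper: group lowercased words by length after translating."""
    d = defaultdict(list)
    for word in text.translate(table).lower().split():
        d[len(word)].append(word)
    return dict(d)

sample = "Don't stop."
spaced = bucket_by_length(sample, to_space)   # apostrophe becomes a word break
deleted = bucket_by_length(sample, to_none)   # apostrophe is simply dropped
```

With the space-replacing table, "Don't" is split into "don" and "t"; with the deleting table it becomes the single word "dont". Which behavior is preferable depends on how contractions should be counted.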

Answer 1 (score: 2)

Using simple iteration.

Demo:

def word_lengths(s):
    d = {}
    for i in s.split():           #Split by space
        l = len(i)
        if l not in d:            #Create len as key
            d[l] = [i]
        else:
            d[l].append(i)  
    return d


print(word_lengths("I ate a bowl of cereal out of a dog bowl today."))

Output:

{1: ['I', 'a', 'a'], 2: ['of', 'of'], 3: ['ate', 'out', 'dog'], 4: ['bowl', 'bowl'], 6: ['cereal', 'today.']}
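
Note that this output keeps "I" capitalized and files "today." under length 6, because the trailing period is counted. If that is not wanted, the same loop could be written with dict.setdefault and a strip step. This is only a sketch under the assumption that leading and trailing punctuation should be ignored; the name `word_lengths_clean` is hypothetical.

```python
from string import punctuation

def word_lengths_clean(s):
    d = {}
    for raw in s.split():
        # Remove punctuation at either end of the token, then lowercase it.
        word = raw.strip(punctuation).lower()
        if word:  # skip tokens that consisted only of punctuation
            # setdefault creates the list the first time a length is seen.
            d.setdefault(len(word), []).append(word)
    return d

result = word_lengths_clean("I ate a bowl of cereal out of a dog bowl today.")
```

Unlike str.translate, str.strip only touches the ends of each token, so punctuation inside a word (an apostrophe, for example) is left in place.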

Answer 2 (score: 1)

Here is a version that uses str.translate to handle punctuation:

from collections import defaultdict

def word_lengths(s, remove='.,!?'):
    trans=str.maketrans('', '', remove)
    s = s.lower().translate(trans)
    d = defaultdict(list)
    for word in s.split():
        d[len(word)].append(word)
    return dict(d)  # Probably unnecessary and return d would work

word_lengths("I ate a bowl of cereal out of a dog bowl today.")

This gives us:

{1: ['i', 'a', 'a'],
 2: ['of', 'of'],
 3: ['ate', 'out', 'dog'],
 4: ['bowl', 'bowl'],
 5: ['today'],
 6: ['cereal']}
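
Whether the dict(d) conversion matters depends on how the result is used afterwards. A small sketch of the difference, using made-up data: looking up a missing key on a defaultdict silently inserts an empty list, while a plain dict raises KeyError.

```python
from collections import defaultdict

d = defaultdict(list)
d[3].append('dog')

_ = d[7]                        # missing key: defaultdict inserts an empty list
keys_after_lookup = sorted(d)   # the key 7 now exists in d

plain = dict({3: ['dog']})
try:
    plain[7]                    # missing key: a plain dict raises KeyError
    raised = False
except KeyError:
    raised = True
```

Returning a plain dict therefore keeps later lookups from adding empty buckets by accident.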

Answer 3 (score: 0)

You can do this with defaultdict, one of the many useful data structures in the collections module of the standard library.

from collections import defaultdict
import re

def word_lengths(text):
    d = defaultdict(list)
    for word in re.findall(r'\w+', text.lower()):
        d[len(word)].append(word)
    return d

We use re.findall to match only the words, without whitespace or punctuation. If you want hyphens and apostrophes to count as word characters, you can adjust the regex.
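
If hyphens and apostrophes should count as word characters, the pattern passed to re.findall could be widened with a character class. The sketch below shows one possible adjustment, not the answer's own code; the function name `word_lengths_extended` and the sample sentence are illustrative.

```python
from collections import defaultdict
import re

def word_lengths_extended(text):
    d = defaultdict(list)
    # [\w'-]+ matches runs of word characters, apostrophes, and hyphens.
    for word in re.findall(r"[\w'-]+", text.lower()):
        d[len(word)].append(word)
    return dict(d)

result = word_lengths_extended("The dog's well-fed today.")
```

One caveat of this pattern: a hyphen or apostrophe standing on its own between spaces would also match as a one-character "word".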

Answer 4 (score: 0)

You can simply loop over the words to build the dictionary.

In [1]: from collections import defaultdict

In [2]: c = defaultdict(list)

In [3]: for word in "I ate a bowl of cereal out of a dog bowl today.".split(' '):
...:     c[len(word)].append(word)
...:     

In [4]: c
Out[4]: 
defaultdict(list,
            {1: ['I', 'a', 'a'],
             2: ['of', 'of'],
             3: ['ate', 'out', 'dog'],
             4: ['bowl', 'bowl'],
             6: ['cereal', 'today.']})
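
Splitting on ' ' behaves differently from split() with no argument when words are separated by more than one space. A short sketch with a made-up input; the helper name `group` is hypothetical.

```python
from collections import defaultdict

def group(words):
    """Hypothetical helper: bucket an iterable of words by length."""
    d = defaultdict(list)
    for word in words:
        d[len(word)].append(word)
    return dict(d)

text = "a  dog"  # two spaces between the words

explicit = group(text.split(' '))  # yields an empty string between the spaces
default = group(text.split())      # collapses any run of whitespace
```

With split(' ') the empty string ends up under key 0, so split() is usually the safer choice for free-form text.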