Question

    sentence="one fish two fish red fish blue fish one red two blue"
    sentence='start '+sentence+' end'
    word_list=sentence.split(' ')
    d={}
    for i in range(len(word_list)-1):
        d[word_list[i]]=word_list[i+1]

    print(word_list)
    print(d)

So I get word_list:

    ['start', 'one', 'fish', 'two', 'fish', 'red',
     'fish', 'blue', 'fish', 'one', 'red', 'two',
     'blue', 'end']

and d:

    {'blue': 'end', 'fish': 'one', 'two': 'blue',
     'one': 'red', 'start': 'one', 'red': 'two'}

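But I need a dictionary whose values are lists of every word that can follow the key. For example, the word 'fish' is followed by 4 different words, so I need: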

    'fish':['two', 'red', 'blue', 'one']

'blue' is followed by 'fish' and 'end':

    'blue':['fish', 'end']

and so on.

Any ideas?

This task is the first step toward generating random sentences.

Thanks :)

Answer 1

You can try this:

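    from collections import defaultdict

    sentence = "one fish two fish red fish blue fish one red two blue"
    word_list = sentence.split()

    # Missing keys start out as empty lists, so append works on first use.
    d = defaultdict(list)
    # Pair each word with the word that follows it.
    for a, b in zip(word_list, word_list[1:]):
        d[a].append(b)

    print(d)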

The resulting dictionary contains:

    {
        "blue": [ "fish" ],
        "fish": [ "two", "red", "blue", "one" ],
        "two": [ "fish", "blue" ],
        "red": [ "fish", "two" ],
        "one": [ "fish", "red" ]
    }

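Note that you do not need to add start and end to avoid reading past the end of the list: zip stops at the shorter of its two arguments, so the last word is simply never paired with a successor.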
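If you do want 'end' to show up as a possible follower, as in the expected output from the question, the markers can still be added before pairing the words. The following is only a minimal sketch of that variant: the function name `build_successors` and its `start`/`end` parameters are made up for illustration and are not part of the answer above.

```python
from collections import defaultdict


def build_successors(sentence, start="start", end="end"):
    """Map each word to the list of words that directly follow it.

    `start` and `end` are hypothetical sentinel words (taken from the
    question's own approach) wrapped around the sentence.
    """
    words = [start] + sentence.split() + [end]
    successors = defaultdict(list)
    # zip stops at the shorter sequence, so the final word is never
    # paired with a nonexistent neighbour.
    for current, following in zip(words, words[1:]):
        successors[current].append(following)
    # Return a plain dict so that looking up an unknown word raises
    # KeyError instead of silently creating an empty entry.
    return dict(successors)


followers = build_successors("one fish two fish red fish blue fish one red two blue")
print(followers["fish"])  # ['two', 'red', 'blue', 'one']
print(followers["blue"])  # ['fish', 'end']
```

With the markers in place, a later random-sentence step would have a defined word to begin from ('start') and a defined word to stop at ('end'); without them, the last 'blue' has no recorded follower.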

Generating dictionary values as lists
