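I have the following string:
>>> sentence = "No, I shouldn't be glad, YOU should be glad."
What I want is to build a dictionary in which each word of the sentence is a key and the word that follows it is the value.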
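>>> dict(sentence)
{('No,'): ['I'], ('I'): ["shouldn't"], ("shouldn't"): ['be'], ('be'): ['glad,', 'glad.'], ('glad,'): ['YOU'], ('YOU'): ['should'], ('should'): ['be']}
As you can see, when a word occurs more than once in the sentence, it gets multiple values. If it is the last word, it is not added to the dictionary. 'glad' does not get multiple values because the word ends with ',' or '.', which makes each occurrence a different string.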
Answer 0: (score: 4)
import collections
sentence = "No, I shouldn't be glad, YOU should be glad."
d = collections.defaultdict(list)
words = sentence.split()
for k, v in zip(words[:-1], words[1:]):
    d[k].append(v)
print(d)
This produces:
defaultdict(<type 'list'>, {'No,': ['I'], 'be': ['glad,', 'glad.'], 'glad,': ['YOU'], 'I': ["shouldn't"], 'should': ['be'], "shouldn't": ['be'], 'YOU': ['should']})
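The pairing step can be looked at on its own. The following is a minimal sketch using the same sentence; `pairs` is a name introduced here for illustration only. It shows that the last word never ends up as a key, and that trimming the first list is optional because `zip` stops at the shorter input.

```python
sentence = "No, I shouldn't be glad, YOU should be glad."
words = sentence.split()

# Pair each word with its successor. The last word has no successor,
# so it never appears in the first position of a pair.
pairs = list(zip(words[:-1], words[1:]))
print(pairs[:3])

# zip stops at the shorter input, so zip(words, words[1:]) gives the same pairs.
same_pairs = list(zip(words, words[1:]))
print(pairs == same_pairs)
```

Either form can be fed to the loop above; the result is the same.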
Answer 1: (score: 3)
In [9]: strs = "No, I shouldn't be glad, YOU should be glad."
In [10]: words = strs.split()
In [19]: dic = {}
In [20]: for x, y in zip(words, words[1:]):
   ....:     dic.setdefault(x, []).append(y)
   ....:
In [21]: dic
Out[21]:
{'I': ["shouldn't"],
'No,': ['I'],
'YOU': ['should'],
'be': ['glad,', 'glad.'],
'glad,': ['YOU'],
'should': ['be'],
"shouldn't": ['be']}
Answer 2: (score: 0)
This is untested but should be close.
words = sentence.split()
sentenceDict = {}
for index in range(len(words) - 1):
    if words[index] in sentenceDict:
        sentenceDict[words[index]].append(words[index + 1])
    else:
        sentenceDict[words[index]] = [words[index + 1]]
Answer 3: (score: 0)
If order is not important, here is just another way to do it:
sentence="No, I shouldn't be glad, YOU should be glad."
#Split the string into words
sentence = sentence.split()
#Create pairs of consecutive words
sentence = zip(sentence,sentence[1:])
from itertools import groupby
from operator import itemgetter
#group the sorted pairs based on the key
sentence = groupby(sorted(sentence, key = itemgetter(0)), key = itemgetter(0))
#finally create a dictionary of the groups
{k:[v for _,v in g] for k, g in sentence}
{'No,': ['I'], 'be': ['glad,', 'glad.'], 'glad,': ['YOU'], 'I': ["shouldn't"], 'should': ['be'], "shouldn't": ['be'], 'YOU': ['should']}
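Two details of this approach are worth spelling out. `itertools.groupby` only groups adjacent items, which is why the pairs are sorted first, and because `sorted` is stable, the values for each key stay in sentence order. A small sketch of both points follows; `pairs`, `ordered`, `result` and `unsorted_keys` are names introduced here for illustration.

```python
from itertools import groupby
from operator import itemgetter

sentence = "No, I shouldn't be glad, YOU should be glad."
words = sentence.split()
pairs = list(zip(words, words[1:]))

# sorted() is stable: pairs sharing the same first word keep their relative order.
ordered = sorted(pairs, key=itemgetter(0))
result = {k: [v for _, v in g] for k, g in groupby(ordered, key=itemgetter(0))}
print(result['be'])

# Without sorting, groupby starts a new group every time the key changes,
# so the two 'be' pairs end up in separate groups.
unsorted_keys = [k for k, _ in groupby(pairs, key=itemgetter(0))]
print(unsorted_keys.count('be'))
```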
Answer 4: (score: 0)
import collections
sentence = "No, I shouldn't be glad, YOU should be glad."
d = collections.defaultdict(list)
words = sentence.split()
for k, v in zip(words[:-1], words[1:]):
    d[k].append(v)
print(d)
This produces:
defaultdict(<type 'list'>, {'No,': ['I'], 'be': ['glad,', 'glad.'], 'glad,': ['YOU'], 'I': ["shouldn't"], 'should': ['be'], "shouldn't": ['be'], 'YOU': ['should']})
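@NLS: I just want to add one thing here. "d = collections.defaultdict(list)", like a plain dict object in Python versions before 3.7, does not preserve the order of the words, so if we have to preserve the order of the sentence, we may have to use tuples.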
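If the goal is to keep keys in the order they first appear in the sentence, one possible option is `collections.OrderedDict`. The sketch below assumes that is what "preserving order" means here; a plain list of `(word, next_word)` tuples would be another way to keep every pair in sequence.

```python
from collections import OrderedDict

sentence = "No, I shouldn't be glad, YOU should be glad."
words = sentence.split()

# OrderedDict remembers the order in which keys were first inserted.
d = OrderedDict()
for k, v in zip(words, words[1:]):
    d.setdefault(k, []).append(v)

print(list(d.items()))
```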