I'm new to Python! I've written code that successfully opens my text file and sorts my list of 100 words. I then put these into a list called stimuli_words, which has no duplicate words, is all lowercase, etc.
But now I want to turn this list into a dictionary where the keys are the 3-letter endings of the words in my list and the values are the words matching those endings.
For example 'ing': 'going', 'hiring', ... but I only want the endings that have more than 40 words sharing those final characters. So far I have this code:
from collections import defaultdict
fq = defaultdict(int)
for w in stimuli_list:
    fq[w] += 1
print(fq)
However, it just returns a dictionary of my words and how many times they occur, which is obviously once each. For example: 'going': 1, 'hiring': 1, 'driving': 1.
Would really appreciate some help!! Thanks!!
Answer 0 (score: 1)
You can do it like this:
dictionary = {}
words = ['going', 'hiring', 'driving', 'letter', 'better', ...]  # your list of words

# Creating words dictionary
for word in words:
    dictionary.setdefault(word[-3:], []).append(word)

# Removing lists that contain less than 40 words:
for key, value in dictionary.copy().items():
    if len(value) < 40:
        del dictionary[key]

print(dictionary)
Output:
{ # Only lists that contain at least 40 words
'ing': ['going', 'hiring', 'driving', ...],
'ter': ['letter', 'better', ...],
...
}
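Since the question already imports collections.defaultdict, the same grouping can be sketched with it as well (reusing the illustrative words list from above); grouping and filtering then become two short steps with no deletion pass:

from collections import defaultdict

groups = defaultdict(list)
for word in words:
    groups[word[-3:]].append(word)

# keep only the endings shared by at least 40 words
dictionary = {ending: wlist for ending, wlist in groups.items() if len(wlist) >= 40}
print(dictionary)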
Answer 1 (score: 0)
Since you are counting whole words (and your words are unique), each one only ever gets a count of 1.
You can instead build keys from the last 3 characters (and use a Counter):
import collections
wordlist = ["driving","hunting","fishing","drive","a"]
endings = collections.Counter(x[-3:] for x in wordlist)
print(endings)
Result:
Counter({'ing': 3, 'a': 1, 'ive': 1})
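The Counter above only yields counts; to get the words themselves, one sketch is to filter the endings by a threshold and collect the matching words in a second pass (a threshold of 2 here so the tiny demo list produces output; the question would use 40):

threshold = 2
frequent = {e for e, n in endings.items() if n >= threshold}
result = {e: [w for w in wordlist if w[-3:] == e] for e in frequent}
print(result)  # {'ing': ['driving', 'hunting', 'fishing']}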
Answer 2 (score: 0)
Create demo data:
import random
# seed so every run produces the same data
random.seed(10)
# base lists for demo data
prae = ["help", "read", "muck", "truck", "sleep"]
post = ["ing", "biothign", "press"]
# lots of data
parts = [x + str(y) + z for x in prae for z in post for y in range(100, 1000, 100)]
# shuffle and take every 120th element
random.shuffle(parts)
stimuli_list = parts[::120]
Build the dictionary from stimuli_list:
# create a key with an empty list for each 3-char ending
dic = {e[-3:]: [] for e in stimuli_list}
# process data and, if enough words fit, fill the list
for d in dic:
    fitting = [x for x in parts if x.endswith(d)]  # adapt to match only the last 2 chars
    if len(fitting) > 5:  # adapt this threshold (the question needs at least 40)
        dic[d] = fitting[:]
# remove keys whose lists stayed empty
for d in [x for x in dic if not dic[x]]:
    del dic[d]
print()
print(dic)
Output:
{'ess': ['help400press', 'sleep100press', 'sleep600press', 'help100press', 'muck400press', 'muck900press', 'muck500press', 'help800press', 'muck100press', 'read300press', 'sleep400press', 'muck800press', 'read600press', 'help200press', 'truck600press', 'truck300press', 'read700press', 'help900press', 'truck400press', 'sleep200press', 'read500press', 'help600press', 'truck900press', 'truck800press', 'muck200press', 'truck100press', 'sleep700press', 'sleep500press', 'sleep900press', 'truck200press', 'help700press', 'muck300press', 'sleep800press', 'muck700press', 'sleep300press', 'help500press', 'truck700press', 'read400press', 'read100press', 'muck600press', 'read900press', 'read200press', 'help300press', 'truck500press', 'read800press']
, 'ign': ['truck200biothign', 'muck500biothign', 'help800biothign', 'muck700biothign', 'help600biothign', 'truck300biothign', 'read200biothign', 'help500biothign', 'read900biothign', 'read700biothign', 'truck400biothign', 'help300biothign', 'read400biothign', 'truck500biothign', 'read800biothign', 'help700biothign', 'help400biothign', 'sleep600biothign', 'sleep500biothign', 'muck300biothign', 'truck700biothign', 'help200biothign', 'sleep300biothign', 'muck100biothign', 'sleep800biothign', 'muck200biothign', 'sleep400biothign', 'truck100biothign', 'muck800biothign', 'read500biothign', 'truck900biothign', 'muck600biothign', 'truck800biothign', 'sleep100biothign', 'read300biothign', 'read100biothign', 'help900biothign', 'truck600biothign', 'help100biothign', 'read600biothign', 'muck400biothign', 'muck900biothign', 'sleep900biothign', 'sleep200biothign', 'sleep700biothign']
}
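A quick, hypothetical sanity check (not part of the original answer) is to print only the group sizes instead of the full lists:

# how many words landed under each ending
print({ending: len(wlist) for ending, wlist in dic.items()})
# with the demo data above: {'ess': 45, 'ign': 45}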