Creating and rearranging a dictionary

Time: 2018-01-04 13:07:25

Tags: python dictionary

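I'm new to Python! I've written code that successfully opens my text file and sorts my list of 100 words. I then put these words in a list called stimuli_words, which contains no duplicates, is all lowercase, and so on.

Now I want to convert this list into a dictionary whose keys are all the 3-letter endings found in my word list, and whose values are the words that have those endings.

For example: 'ing': going, hiring, ... However, I only want the endings that have more than 40 words corresponding to the last two characters. So far I have this code: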

from collections import defaultdict
fq = defaultdict( int )
for w in stimuli_list:
    fq[w] += 1
print(fq)

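However, this just returns a dictionary of my words and how many times each one occurs, which is obviously once. For example: 'going': 1, 'hiring': 1, 'driving': 1.

I'd really appreciate some help! Thanks!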

3 Answers:

Answer 0 (score: 1)

You can do it like this:

dictionary = {}
words = ['going', 'hiring', 'driving', 'letter', 'better']  # ... your list of words

# Group the words by their last three characters
for word in words:
    dictionary.setdefault(word[-3:], []).append(word)

# Remove endings that do not have more than 40 words
# (iterate over a copy, because the dict is modified inside the loop)
for key, value in dictionary.copy().items():
    if len(value) <= 40:
        del dictionary[key]

print(dictionary)

Output:

{ # Only lists that are longer than 40 words
    'ing': ['going', 'hiring', 'driving', ...],
    'ter': ['letter', 'better', ...],
    ...
}
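
The same grouping can also be sketched with the defaultdict that the question already imports, using list instead of int as the default factory. This is only a minimal sketch, not the answer's original code: the function name group_by_ending and its parameters suffix_len and min_count are hypothetical, and the threshold is lowered to 1 in the call so that the tiny sample list gives a non-empty result.

```python
from collections import defaultdict


def group_by_ending(words, suffix_len=3, min_count=40):
    """Group words by their last `suffix_len` characters.

    Only endings shared by more than `min_count` words are kept.
    (Hypothetical helper; the name and defaults are assumptions.)
    """
    groups = defaultdict(list)
    for word in words:
        groups[word[-suffix_len:]].append(word)
    # Return a plain dict holding only the sufficiently common endings.
    return {ending: ws for ending, ws in groups.items() if len(ws) > min_count}


sample = ['going', 'hiring', 'driving', 'letter', 'better', 'cat']
result = group_by_ending(sample, min_count=1)
print(result)
```

With the default min_count=40 the same sample list would give an empty dictionary, since no ending occurs more than 40 times.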

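Answer 1 (score: 0)

Because you are counting whole words, and your list contains no duplicates, each word gets a count of exactly 1.

You can build the keys from the last 3 characters instead (and use Counter):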

import collections

wordlist = ["driving","hunting","fishing","drive","a"]

endings = collections.Counter(x[-3:] for x in wordlist)

print(endings)

Result (the order of endings with equal counts may differ between Python versions):

Counter({'ing': 3, 'a': 1, 'ive': 1})
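
The Counter gives the number of words per ending, but not the words themselves. One possible way to get from the counts back to the words is sketched below. The threshold of 2 and the names threshold, frequent and selected are assumptions chosen for this small demo list; the question itself asks for more than 40.

```python
import collections

wordlist = ["driving", "hunting", "fishing", "drive", "a"]
threshold = 2  # assumed for this small demo; the question asks for 40

# Count how often each 3-character ending occurs.
endings = collections.Counter(w[-3:] for w in wordlist)

# Endings that occur more than `threshold` times.
frequent = {ending for ending, count in endings.items() if count > threshold}

# Keep only the words whose ending is frequent.
selected = [w for w in wordlist if w[-3:] in frequent]
print(selected)
```

Here only 'ing' occurs more than twice, so selected holds the three words ending in 'ing'.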

Answer 2 (score: 0)

Create demo data:

import random

# Use a fixed seed so every run produces the same data
random.seed(10)

# Base lists for the demo data
prae = ["help", "read", "muck", "truck", "sleep"]
post = ["ing", "biothign", "press"]

# Lots of data: 5 prefixes x 9 numbers x 3 suffixes = 135 strings
parts = [x + str(y) + z for x in prae for z in post for y in range(100, 1000, 100)]

# Shuffle, then sample every 120th element
random.shuffle(parts)
stimuli_list = parts[::120]

print(stimuli_list)

Create the dictionary:
# Create one key per 3-character ending, each with an empty list
dic = {e[-3:]: [] for e in stimuli_list}

# Collect the matching words for each ending if there are enough of them
for d in dic:
    fitting = [x for x in parts if x.endswith(d)]   # adapt to match only the last 2 chars

    if len(fitting) > 5:                            # adapt this to the minimum count you need
        dic[d] = fitting

# Remove keys whose list is still empty (a dict has no remove(), so use del)
for d in [x for x in dic if not dic[x]]:
    del dic[d]

print()
print(dic)

Output:

{'ess': ['help400press', 'sleep100press', 'sleep600press', 'help100press', 'muck400press', 'muck900press', 'muck500press', 'help800press', 'muck100press', 'read300press', 'sleep400press', 'muck800press', 'read600press', 'help200press', 'truck600press', 'truck300press', 'read700press', 'help900press', 'truck400press', 'sleep200press', 'read500press', 'help600press', 'truck900press', 'truck800press', 'muck200press', 'truck100press', 'sleep700press', 'sleep500press', 'sleep900press', 'truck200press', 'help700press', 'muck300press', 'sleep800press', 'muck700press', 'sleep300press', 'help500press', 'truck700press', 'read400press', 'read100press', 'muck600press', 'read900press', 'read200press', 'help300press', 'truck500press', 'read800press']
, 'ign': ['truck200biothign', 'muck500biothign', 'help800biothign', 'muck700biothign', 'help600biothign', 'truck300biothign', 'read200biothign', 'help500biothign', 'read900biothign', 'read700biothign', 'truck400biothign', 'help300biothign', 'read400biothign', 'truck500biothign', 'read800biothign', 'help700biothign', 'help400biothign', 'sleep600biothign', 'sleep500biothign', 'muck300biothign', 'truck700biothign', 'help200biothign', 'sleep300biothign', 'muck100biothign', 'sleep800biothign', 'muck200biothign', 'sleep400biothign', 'truck100biothign', 'muck800biothign', 'read500biothign', 'truck900biothign', 'muck600biothign', 'truck800biothign', 'sleep100biothign', 'read300biothign', 'read100biothign', 'help900biothign', 'truck600biothign', 'help100biothign', 'read600biothign', 'muck400biothign', 'muck900biothign', 'sleep900biothign', 'sleep200biothign', 'sleep700biothign']
}
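
The absence of an 'ing' key in this output can be explained with a little arithmetic: the demo data contains 5 × 3 × 9 = 135 strings, so each of the three endings occurs 45 times, while slicing with a step of 120 samples only two strings, and the keys are taken from those samples. The sketch below rebuilds the data without shuffling to confirm these numbers. Which two endings land in stimuli_list depends on the shuffle, so that part is deliberately not reproduced here.

```python
from collections import Counter

prae = ["help", "read", "muck", "truck", "sleep"]
post = ["ing", "biothign", "press"]

# Same construction as in the answer, but without shuffling.
parts = [x + str(y) + z for x in prae for z in post for y in range(100, 1000, 100)]

total = len(parts)                              # 5 * 3 * 9 strings in total
per_ending = Counter(p[-3:] for p in parts)     # strings sharing each 3-char ending
sample_size = len(parts[::120])                 # samples taken with a step of 120

print(total, dict(per_ending), sample_size)
```

A smaller step, or taking the keys from parts instead of from the sample, would make all three endings appear.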