Question

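I'm trying to create a dictionary from multiple lists, based on the index of each word in its list. The keys are the words themselves, and the values are the indices of the word in the list. If a word appears twice, the value looks like [position1, position2]. In the example below, the second list contains 'the' twice, so its entry would be 'the': [2, 6], and so on.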

def create_myDict(mylists):       
    myDic = {}
    for t in mylists:
        myDic[t[0]] = t[:]
    return myDic

Sample input:

[['why', 'did', 'the', 'dalmation', 'need', 'glasses', 'he', 'was', 'seeing', 'spots']
,['what', 'did', 'the', 'book', 'say', 'to', 'the', 'page', 'don', 't', 'turn', 'away', 'from', 'me']]

Expected output:

myDic ={ 'what':[0], 'did':[1], 'the':[2,6],
'book':[3],'say':[4],'to':[5],'page':[7],'don':[8]
.....
}

But it doesn't work. Any ideas?

Answer 1

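I think this will do what you need. The list stored for each word has the same length as the number of input lists, and each entry is either None or the list of indices of that word in the corresponding input list: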

import collections

def create_myDict(*lists):
    """ Map words to their index positions in each of the lists. """
    # initialize results
    unique_words = {word for words in lists for word in words}
    results = {word: [[] for _ in range(len(lists))] for word in unique_words}

    for list_num, words in enumerate(lists):
        indices = collections.defaultdict(list)
        for index, word in enumerate(words):
            indices[word].append(index)
        for word in indices:
            results[word][list_num] = indices[word]

    # return results with empty lists converted to None
    return {word: [index if index else None for index in index_lists]
            for word, index_lists in results.items()}

list1 = ['why', 'did', 'the', 'dalmation', 'need', 'glasses', 'he', 'was',
         'seeing', 'spots']
list2 = ['what', 'did', 'the', 'book', 'say', 'to', 'the', 'page', 'don\'t',
         'turn', 'away', 'from', 'me']

print('create_myDict(list1, list2) = {')
for item in sorted(create_myDict(list1, list2).items()):
    print('    {!r}: {},'.format(*item))
print('}')

Output for the sample lists:

create_myDict(list1, list2) = {
    'away': [None, [10]],    
    'book': [None, [3]],     
    'dalmation': [[3], None],
    'did': [[1], [1]],       
    "don't": [None, [8]],    
    'from': [None, [11]],    
    'glasses': [[5], None],  
    'he': [[6], None],       
    'me': [None, [12]],      
    'need': [[4], None],     
    'page': [None, [7]],     
    'say': [None, [4]],      
    'seeing': [[8], None],   
    'spots': [[9], None],    
    'the': [[2], [2, 6]],    
    'to': [None, [5]],       
    'turn': [None, [9]],     
    'was': [[7], None],      
    'what': [None, [0]],     
    'why': [[0], None],      
}

Update

If your input is a list of lists, as you mentioned in a comment, just do the following:

myinput = [['what', 'did'], ['why', 'did', 'the', 'strawberry']]

print('create_myDict(*myinput) = {')
for item in sorted(create_myDict(*myinput).items()):
    print('    {!r}: {},'.format(*item))
print('}')

Output:

create_myDict(*myinput) = {
    'did': [[1], [1]],
    'strawberry': [None, [3]],
    'the': [None, [2]],
    'what': [[0], None],
    'why': [None, [0]],
}
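
If what you actually want is the shape shown in the question's expected output (one plain word-to-positions dictionary per input list), a small sketch along these lines may be enough. The helper names positions_in and positions_per_list are invented for this example and are not part of the code above.

```python
from collections import defaultdict


def positions_in(words):
    """Return {word: [indices]} for a single list of words (hypothetical helper)."""
    indices = defaultdict(list)
    for index, word in enumerate(words):
        indices[word].append(index)
    return dict(indices)


def positions_per_list(lists):
    """Build one word-to-positions dictionary per input list (hypothetical helper)."""
    return [positions_in(words) for words in lists]


myinput = [['what', 'did'], ['why', 'did', 'the', 'strawberry']]
per_list = positions_per_list(myinput)
print(per_list[0]['did'])         # [1]
print(per_list[1]['strawberry'])  # [3]
```

Keeping one dictionary per list avoids the None placeholders, at the cost of having to look a word up in each dictionary separately.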

Answer 2

This function creates a dictionary with the words as keys and lists of their indices as values:

def create_myDict(mylists):
    myDict = {}
    for sublist in mylists:
        for i in range(len(sublist)):
            if sublist[i] in myDict:
                # word seen before: record another position
                myDict[sublist[i]].append(i)
            else:
                # first occurrence: start a new list of positions
                myDict[sublist[i]] = [i]

    return myDict

The same thing, slightly shorter, using setdefault:

def create_myDict(mylists):
    myDict = {}
    for sublist in mylists: 
        for i in range(len(sublist)):
            myDict.setdefault(sublist[i], []).append(i)
    return myDict

If you don't want to check for the key's existence at all, there is also collections.defaultdict:

from collections import defaultdict

def create_myDict(mylists):
    myDict = defaultdict(list)
    for sublist in mylists: 
        for i in range(len(sublist)):
            myDict[sublist[i]].append(i)
    return myDict
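
For reference, here is a minimal sketch of the same idea written with enumerate instead of range(len(...)), run on the sample input from the question. The name index_words is made up for this illustration. Note that, like the versions above, it pools positions from every sublist into one list per word, so a word that occurs in both sublists collects indices from both.

```python
from collections import defaultdict


def index_words(mylists):
    """Map each word to the positions where it occurs, pooled over all sublists.

    index_words is a hypothetical name used only for this sketch.
    """
    positions = defaultdict(list)
    for sublist in mylists:
        for i, word in enumerate(sublist):
            positions[word].append(i)
    return dict(positions)


sample = [
    ['why', 'did', 'the', 'dalmation', 'need', 'glasses', 'he', 'was', 'seeing', 'spots'],
    ['what', 'did', 'the', 'book', 'say', 'to', 'the', 'page', 'don', 't', 'turn', 'away', 'from', 'me'],
]

result = index_words(sample)
print(result['the'])   # [2, 2, 6]: index 2 from the first sublist, 2 and 6 from the second
print(result['did'])   # [1, 1]
print(result['what'])  # [0]
```

If the positions need to stay separate per sublist, call the function once per sublist, for example index_words([sublist]).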

Answer 3

Maybe:

from collections import Counter

my_word_list = ['why', 'did', 'the', 'dalmation', 'need', 'glasses', 'he', 'was', 'seeing', 'spots']

# word -> index in the list; a repeated word keeps only its last index
order_occurance = dict(zip(my_word_list, range(len(my_word_list))))
# word -> number of occurrences
count_occurance = Counter(my_word_list)
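
To make clear what these two objects hold, here is a small sketch on a shortened, made-up word list. The dict built with zip keeps only one position per word (the last one, because later pairs overwrite earlier ones), and Counter holds how many times each word occurs, so neither gives the full list of positions on its own.

```python
from collections import Counter

words = ['the', 'book', 'to', 'the', 'page']  # shortened, made-up sample

# Later (word, index) pairs overwrite earlier ones, so only the last position survives.
last_position = dict(zip(words, range(len(words))))
print(last_position['the'])  # 3

# Counter maps each word to its number of occurrences.
counts = Counter(words)
print(counts['the'])   # 2
print(counts['book'])  # 1
```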

Answer 4

This code should work:

def cd(mylists):
    # counts how often each word occurs; expects a flat list of words
    myDic = {}
    for t in mylists:
        if t in myDic:
            myDic[t] += 1
        else:
            myDic[t] = 1
    return myDic

Creating a dictionary from multiple lists
