任何人都可以解释这个列表理解吗?

时间:2017-10-15 16:19:26

标签: python numpy machine-learning

def unpack_dict(matrix, map_index_to_word):
    table = sorted(map_index_to_word, key=map_index_to_word.get)      
    data = matrix.data
    indices = matrix.indices
    indptr = matrix.indptr        
    num_doc = matrix.shape[0]    
    return [{k:v for k,v in zip([table[word_id] for word_id in 
    indices[indptr[i]:indptr[i+1]] ],
    data[indptr[i]:indptr[i+1]].tolist())} \
               for i in range(num_doc) ]

wiki['tf_idf'] = unpack_dict(tf_idf, map_index_to_word)

enter image description here

map_index_to_word是单词词典:索引为几千个单词。 tf_idf是TFIDF稀疏向量 DataFrame wiki显示在此处的屏幕截图中

2 个答案:

答案 0 :(得分:3)

[{k: v for k, v in zip([table[word_id] for word_id in indices[indptr[i]:indptr[i + 1]]],data[indptr[i]:indptr[i + 1]].tolist())} for i in range(num_doc)]

与:

相同
final_list = []
for i in range(num_doc):
    new_list = []
    for word_id in indices[indptr[i]:indptr[i + 1]]:
        new_list.append(table[word_id])

    new_dict = {}
    for k, v in zip(new_list, data[indptr[i]:indptr[i + 1]].tolist()):
        new_dict[k] = v
    final_list.append(new_dict)

答案 1 :(得分:3)

此?

[{k:v for k,v in zip([table[word_id] for word_id in 
    indices[indptr[i]:indptr[i+1]] ],
    data[indptr[i]:indptr[i+1]].tolist())} \
               for i in range(num_doc) ]

外在理解是

[... for i in range(num_doc) ]

只需一个简单的循环num_doc次。

里面是字典理解。

{k:v for k,v in zip()}

zip获取k密钥:

[table[word_id] for word_id in indices[indptr[i]:indptr[i+1]] ]

v值来自:

data[indptr[i]:indptr[i+1]].tolist()

因此i外部变量创建切片范围indptr[i]:indptr[i+1]

所以它制作了一个词典列表。字典键来自table[word_id],其中word_id的范围为indices,其值为data的相应范围。