Implementing the K-SVD algorithm in Python

Time: 2018-01-11 08:02:10

Tags: python dictionary machine-learning sparse-matrix unsupervised-learning

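I am currently trying to implement the K-SVD algorithm in Python, based on the following paper (http://www.cs.technion.ac.il/~elad/publications/journals/2004/32_KSVD_IEEE_TSP.pdf). So far I have written the code below, and it seems to work well at building a good dictionary for sparse representation. My code is based on an existing Python implementation (https://github.com/nel215/ksvd/blob/master/ksvd/init.py). However, I modified it because I am applying K-SVD dictionary learning to a different kind of signal than it is normally used for: I am trying to build a sparse dictionary for machine vibration signals rather than ordinary image signals. My code is shown below: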

import numpy as np

def Phi_designer_k_svd(Phi_list, y_list, index):
    '''
    y_list is expected to have size k x y_freq (k rows of y_freq-sized horizontal signals).
    mp_process is a self-implemented OMP function, loosely following
    the implementation here: https://github.com/davebiagioni/pyomp/blob/master/omp.py

    Note: n, k, y_cur, sparsity, vbose, tol and l2_norm are not defined in this
    snippet; they are assumed to come from the enclosing scope.
    '''
    # Phi has size (m x n)
    Phi = Phi_list[index]

    for i in range(0,1000):
        # keep a copy of the dictionary from the previous iteration
        Phi_old = Phi.copy()
        x_mp = np.zeros((n,k))

        #for every column of x_mp, calculate the sparse representation of the column
        #(find the representation of x_mp that would give minimum solution for ||y - Phi*x_mp||)
        for j in range(0, k): 
            # find approximation of x signal
            x_mp[:,j], _, _ = mp_process(Phi, y_cur[:,j], ncoef=sparsity, verbose=vbose)

        # For every t-th atom in Phi, update the atom so that it minimizes the
        # error obtained from the sparse codes in x_mp.
        for t in range(0, n):

            #Choose the COLUMN indexes in the t-th ROW of x_mp that is NON zero!
            #(synonymous with picking signals (which column) of x contributed 
            # directly by the t-th atom of the dictionary)

            I = x_mp[t] != 0

            #if there are no contributions made by this atom for x_mp, then continue to the next atom.
            if np.sum(I) == 0:
                continue


            # Only the columns with a nonzero entry in the t-th row of x_mp are used
            # (i.e. the signals that use the t-th atom); the rest are ignored.
            x_copy = x_mp[:, I]

            # zero the t-th row so the t-th atom's contribution is excluded
            x_copy[t] = 0

            # copy Phi with the t-th atom zeroed (disables the contribution of the t-th atom)
            copy = Phi.copy()
            copy[:, t] = 0

            # residual of the selected signals when the t-th atom is left out
            error = y_cur[:, I] - np.dot(copy, x_copy)


            # SVD of the residual matrix; np.linalg.svd returns the right singular
            # vectors as the ROWS of its third output (V transposed)
            U, s, Vt = np.linalg.svd(error, full_matrices=False)

            # the new atom is the first left singular vector (unit norm)
            Phi[:, t] = U[:, 0]

            # update only the selected nonzero entries of row t of x_mp with the
            # first singular value times the first right singular vector
            x_mp[t, I] = s[0] * Vt[0, :]


        previous_norm = l2_norm(Phi_old)
        detected_norm = l2_norm(Phi)

        norm_diff = previous_norm - detected_norm

        #**** Convergence condition. Not sure correct or not based on paper.
        if abs(norm_diff) < tol:
            break

    Phi_list[index] = Phi
    return
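
One detail of the atom update that is easy to check in isolation: np.linalg.svd returns the right singular vectors as rows of its third output, so the best rank-1 fit to the residual uses row 0 of that output, not column 0. The following is only a minimal sketch on random data; the names E, atom and coeffs are illustrative and do not appear in the code above.

```python
import numpy as np

rng = np.random.default_rng(0)
E = rng.standard_normal((8, 5))   # stand-in for the restricted residual matrix

U, s, Vh = np.linalg.svd(E, full_matrices=False)

atom = U[:, 0]              # candidate dictionary atom (unit norm)
coeffs = s[0] * Vh[0, :]    # candidate coefficients for the selected signals
rank1 = np.outer(atom, coeffs)

# Taking a column of Vh instead of a row still gives a rank-1 matrix,
# but it is generally not the best rank-1 approximation of E.
wrong = np.outer(atom, s[0] * Vh[:, 0])

err_right = np.linalg.norm(E - rank1)
err_wrong = np.linalg.norm(E - wrong)
print(err_right, err_wrong)
```

The error of the row-based update equals the energy in the remaining singular values, which is the smallest any rank-1 update can achieve.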

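However, I found that this code only produces a dictionary that works for the training data set; the compression does not work for any new data set. I suspect I may not have implemented the paper's stopping criterion correctly (marked with '****' in the code above, near the end of the for loop), because I am not sure what the appropriate stopping criterion for my algorithm is. Has anyone implemented their own version of K-SVD in Python? If so, I would really appreciate help checking the validity of the stopping criterion I implemented, and perhaps any additional tips for ensuring a better match between the learned dictionary and any new set of signals.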
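
To make the stopping question concrete: if l2_norm is a matrix norm such as the Frobenius norm (an assumption, since the helper is not shown), then comparing the norm of Phi between iterations says little, because every updated atom has unit norm and the norm of the whole dictionary barely moves. One common alternative, offered here as a sketch rather than as what the paper prescribes, is to track the relative reconstruction error of the signals. The helper names relative_error, has_converged and normalize_columns are hypothetical.

```python
import numpy as np

def relative_error(Y, Phi, X):
    """Relative Frobenius reconstruction error ||Y - Phi X||_F / ||Y||_F."""
    return np.linalg.norm(Y - Phi @ X) / np.linalg.norm(Y)

def has_converged(prev_err, err, tol=1e-4):
    """Stop when the error changes by less than tol between iterations."""
    return abs(prev_err - err) < tol

def normalize_columns(A):
    """Scale every column (atom) to unit Euclidean norm."""
    return A / np.linalg.norm(A, axis=0, keepdims=True)

rng = np.random.default_rng(0)
Phi_a = normalize_columns(rng.standard_normal((16, 32)))
Phi_b = normalize_columns(rng.standard_normal((16, 32)))

# Two unrelated dictionaries with unit-norm atoms have the same Frobenius norm,
# so a norm-difference test cannot tell them apart.
norm_gap = abs(np.linalg.norm(Phi_a) - np.linalg.norm(Phi_b))

# The reconstruction error does tell them apart.
X = np.zeros((32, 10))
X[rng.integers(0, 32, size=10), np.arange(10)] = 1.0   # 1-sparse codes
Y = Phi_a @ X                                          # signals built from Phi_a
err_a = relative_error(Y, Phi_a, X)
err_b = relative_error(Y, Phi_b, X)
print(norm_gap, err_a, err_b)
```

The same relative_error can be computed on signals that were held out of training, after sparse-coding them with the learned dictionary, which gives a direct measure of how well the dictionary matches new data.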

0 answers:

No answers yet