Intuition and interpretation of the DCT

Date: 2018-05-17 05:26:34

Tags: python neural-network keras conv-neural-network dct

I am trying to implement the DCT of MNIST images in Keras, and I have a few questions about it:

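  1. When I visualize the DCT coefficients of the MNIST dataset, I see images with a black background and white patterns that represent the frequency information of the input images. Why are the colors inverted, in the sense of a black background with white digits, rather than matching the original image with its white background and black digits?
  2. If I take the DCT of the DCT coefficients of the input image, the result shows a stronger pattern resembling the original image than applying the DCT only once (see the attached images). Why is that?
  3. Here is my code: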

    import keras
    from keras import backend as K
    from keras.models import Sequential
    from keras.layers import Dense, Dropout, Activation, Flatten, Add
    from keras.layers import Convolution2D, MaxPooling2D
    from keras.utils import np_utils
    from keras.layers.core import Lambda
    from keras.datasets import mnist
    from PIL import Image
    import numpy as np
    import matplotlib.pyplot as plt
    %matplotlib inline
    
    (X_train, y_train), (X_test, y_test) = mnist.load_data()
    X_train = X_train.astype('float32')
    X_test = X_test.astype('float32')
    X_train /= 255
    X_test /= 255
    
    model = Sequential()
    model.add(Lambda(lambda x: K.tf.spectral.dct(K.transpose(K.tf.spectral.dct(K.transpose(x), type=2, norm='ortho')), type=2, norm='ortho') ,input_shape=(28, 28,1), output_shape=(28,28,1)))
    model.add(Lambda(lambda x: K.tf.spectral.dct(K.transpose(K.tf.spectral.dct(K.transpose(x), type=2, norm='ortho')), type=2, norm='ortho'),input_shape=(28, 28,1), output_shape=(28,28,1)))
    
    X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
    
    viz_dct = model.predict(X_train[:len(X_train)//2])
    
    def get_reconstructed_image(coeff):
        coeff = coeff*255
        img = Image.fromarray(coeff)
        return img
    
    print(viz_dct.shape)
    viz_dct = viz_dct.reshape(viz_dct.shape[0],viz_dct.shape[1],viz_dct.shape[2])
    plt.imshow(get_reconstructed_image(viz_dct[5]))
    

Here are the output images from the program above:

Original image:

[Image: the original image from the dataset]

After the first DCT:

[Image: the output of the first DCT, shown as an image]

After the second DCT:

[Image: the output after the second DCT]

1 Answer:

Answer 0 (score: 0)

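Image values are typically 8-bit, while the DCT of 8-bit data needs a wider range, typically 16-bit, because the coefficients can be negative and can exceed the input maximum. Taking an image, applying the DCT, and then displaying the result directly can therefore produce pixel values that are out of range for display.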
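
To make the range issue concrete, here is a minimal sketch. It is not the code from the question: it assumes SciPy's `scipy.fft.dctn` in place of the Keras/TensorFlow layer, a synthetic 28x28 bar image in place of an MNIST digit, and a hypothetical helper `to_displayable` that min-max rescales the coefficients purely for viewing.

```python
import numpy as np
from scipy.fft import dctn

# Synthetic 28x28 stand-in for a digit: a bright vertical bar on a black
# background, with values in [0, 1] like the normalized images in the question.
img = np.zeros((28, 28), dtype=np.float32)
img[4:24, 12:16] = 1.0

# Orthonormal 2-D DCT-II taken over both spatial axes.
coeffs = dctn(img, type=2, norm='ortho')

# The coefficients are signed and are not confined to the input range [0, 1].
print(coeffs.min(), coeffs.max())

def to_displayable(c):
    """Hypothetical helper: min-max rescale to uint8 in [0, 255], for display only."""
    c = c.astype(np.float64)
    span = c.max() - c.min()
    if span == 0:
        return np.zeros(c.shape, dtype=np.uint8)
    return np.round(255 * (c - c.min()) / span).astype(np.uint8)

display = to_displayable(coeffs)
```

Multiplying raw coefficients by 255 and passing them straight to an image constructor, as `get_reconstructed_image` does, leaves negative and oversized values in place, so what appears on screen depends on how the viewer clips or rescales them. Rescaling explicitly, or plotting the log of the absolute value, is one way to keep that choice under your control.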