Vector quantization (converting float vectors to short vectors)

Time: 2017-11-29 11:21:10

Tags: python numpy vector scikit-learn data-science

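I want to quantize float vectors into short (16-bit integer) vectors. I searched online and found many vector quantization algorithms, such as LBG. However, I still don't understand how to map the float vector space onto the short vector space. After further research I found an article that does exactly what I want: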

    import numpy as np
    from sklearn.preprocessing import normalize

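    def get_median_values_for_bins(bins):
        # Map each np.digitize bin index to a representative value.
        # Note: despite the name, this stores the lower edge of each bin,
        # not the midpoint between the two edges.
        median_values = {}
        for binidx in range(1, bins.shape[0]):
            binval_prev = bins[binidx - 1]
            median_values[binidx] = binval_prev

        # np.digitize returns len(bins) for values at or above the last edge
        median_values[bins.shape[0]] = bins[bins.shape[0] - 1]
        return median_values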

    def get_quantized_features(features, quantization_factor=30):
        # L2-normalize each row, then shift so that no value is negative
        normalized_features = normalize(features, axis=1)
        offset = np.abs(np.min(normalized_features))
        offset_features = normalized_features + offset  # Making all feature values positive

        # Let's proceed to quantize these positive feature values
        min_val = np.min(offset_features)
        max_val = np.max(offset_features)

        bins = np.linspace(start=min_val, stop=max_val, num=quantization_factor)
        median_values = get_median_values_for_bins(bins)
        original_quantized_features = np.digitize(offset_features, bins)

        # Build a list rather than a lazy map object so this also works on Python 3
        quantized_features = np.apply_along_axis(lambda row: [median_values[x] for x in row], 1, original_quantized_features)
        quantized_features = np.floor(quantization_factor * quantized_features)
        return quantized_features



    # features is expected to be a 2-D float array (one row per sample)
    quantization_factor = 5000  # Adjust this depending on accuracy of quantized features.

    quantized_features = get_quantized_features(features, quantization_factor)
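For comparison, here is a minimal sketch of one way a float array can end up as actual 16-bit integers. It uses plain uniform scalar quantization with NumPy only, which is a simpler technique than LBG and is not the code from the article quoted above. The function names `quantize_to_int16` and `dequantize`, the `num_levels` parameter, and the random sample data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def quantize_to_int16(features, num_levels=5000):
    """Uniformly quantize a float array to int16 codes in [0, num_levels - 1].

    Hypothetical helper for illustration; returns the codes plus the
    minimum and step size needed to map the codes back to floats.
    """
    if not 2 <= num_levels <= 32768:
        raise ValueError("num_levels must fit in a non-negative int16 range")
    features = np.asarray(features, dtype=np.float64)
    min_val = features.min()
    max_val = features.max()
    # Guard against a constant input, where the value range would be zero.
    scale = (max_val - min_val) / (num_levels - 1) if max_val > min_val else 1.0
    codes = np.round((features - min_val) / scale).astype(np.int16)
    return codes, min_val, scale


def dequantize(codes, min_val, scale):
    """Map int16 codes back to approximate float values."""
    return codes.astype(np.float64) * scale + min_val


# Made-up sample data, only to exercise the two helpers.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8))

codes, min_val, scale = quantize_to_int16(features, num_levels=5000)
restored = dequantize(codes, min_val, scale)
max_error = np.max(np.abs(features - restored))
```

Under these assumptions the reconstruction error of each value is bounded by half the step size `scale`, so increasing `num_levels` (up to what fits in an int16) trades a larger code range for better accuracy. Storing `min_val` and `scale` alongside the codes is what makes the mapping reversible.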

0 answers:

No answers