How does scikit-learn's FastICA unmix two signals that each consist of Gaussian noise?

Asked: 2018-04-27 18:23:41

Tags: python scikit-learn gaussian noise dimensionality-reduction

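I'm not sure whether this is best asked here or on StackExchange, but since it is a programming question as well as possibly a math question, here goes.

The question is about FastICA.

Given input time series ("observations" below), where each time series is a linear mixture of n_components signals, ICA returns the signals and the mixing matrix. From section 3 of http://www.cs.jhu.edu/~ayuille/courses/Stat161-261-Spring14/HyvO00-icatut.pdf I understand that at most one of the signals may be Gaussian noise. Yet below I seem to demonstrate that FastICA recovers both signals even when both are noise (shown here as a function of time-series length, from one time step to 10,000 time steps, with 16 time series):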

# Snippet below adapted from http://scikit-learn.org/stable/auto_examples/decomposition/plot_ica_blind_source_separation.html
import numpy as np
from sklearn.decomposition import FastICA

for i in [1, 2, 3, 4, 5, 10, 20, 100, 1000, 10000]:  # number of timepoints
    # Generate sample data
    np.random.seed(0)
    n_samples = i
    #
    # Two sources, each pure Gaussian noise
    s1 = np.array([np.random.normal() for q in range(n_samples)])
    s2 = np.array([np.random.normal() for q in range(n_samples)])
    #
    S = np.c_[s1, s2]
    S += 0.2 * np.random.normal(size=S.shape)  # Add extra noise, just to muddy the signals
    #
    S /= S.std(axis=0)  # Standardize data (std is 0 when n_samples == 1, giving NaNs)
    # Mix data
    A = np.array([[np.random.normal(), np.random.normal()] for j in range(16)])  # Mixing matrix (16 time series x 2 sources)
    X = np.dot(S, A.T)  # Generate observations
    #
    # Compute ICA
    ica = FastICA(n_components=2)
    print(i, "\t", end="")
    try:
        S_ = ica.fit_transform(X)  # Reconstruct signals
    except ValueError:
        print("ValueError: ICA does not run")
        continue
    A_ = ica.mixing_  # Get estimated mixing matrix
    #
    # We can `prove` that the ICA model applies by reverting the unmixing.
    print(np.allclose(X, np.dot(S_, A_.T) + ica.mean_))  # X - AS ~ 0

Output:

1   ValueError: ICA does not run
2   False
3   True
4   True
5   True
10  True
20  True
100     True
1000    True
10000   True

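Why does this work? That is, why is X - AS ~ 0 (the allclose() condition above)? Note that it still works if we generate far more time series than the 16 used here (e.g., 1,000 time series still works).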
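One way to probe this is to separate two things the allclose() check might be measuring: whether X can be rebuilt from two components at all, and whether the estimated sources actually match the true ones. The sketch below is only an illustration under assumptions not in the snippet above: it uses its own random generator (`rng`), a fixed `random_state`, and PCA as a comparison that makes no independence assumption. It checks the rank of the centered observations, rebuilds X with two-component PCA, rebuilds X with FastICA, and then correlates the true sources with the estimated ones.

```python
import warnings

import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.RandomState(0)       # assumption: separate seed from the question's snippet
n_samples, n_series = 1000, 16

S = rng.normal(size=(n_samples, 2))  # two Gaussian-noise sources
A = rng.normal(size=(n_series, 2))   # mixing matrix (16 time series x 2 sources)
X = S @ A.T                          # observations: rank 2 by construction

# 1) The centered observations span only two dimensions.
rank = np.linalg.matrix_rank(X - X.mean(axis=0))

# 2) Two-component PCA also rebuilds X, with no independence assumption.
pca = PCA(n_components=2)
X_pca = pca.inverse_transform(pca.fit_transform(X))
pca_ok = np.allclose(X, X_pca)

# 3) FastICA reconstruction, the same check as in the question.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # FastICA may warn about convergence on Gaussian data
    ica = FastICA(n_components=2, random_state=0)
    S_est = ica.fit_transform(X)
ica_ok = np.allclose(X, S_est @ ica.mixing_.T + ica.mean_)

# 4) Cross-correlation between true sources (rows) and estimated sources (columns).
corr = np.corrcoef(S.T, S_est.T)[:2, 2:]

print("rank of centered X:", rank)
print("PCA rebuilds X:", pca_ok)
print("FastICA rebuilds X:", ica_ok)
print("corr(true, estimated):")
print(np.round(corr, 2))
```

If both reconstructions succeed, the allclose() condition would seem to say only that X lies in a two-dimensional subspace, not that the individual sources were identified. The correlation matrix in step 4 is the more direct test: for Gaussian sources it need not be close to a signed permutation matrix, since such sources are determined only up to a rotation.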

0 answers:

No answers yet