Question

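I am building a neural network that uses t-distribution noise. I am using the function defined in the numpy library, np.random.standard_t, and the one defined in tensorflow, tf.distributions.StudentT. The documentation for the first function is here, and for the second here. I am using the functions as follows: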

import numpy as np
import tensorflow as tf

a = np.random.standard_t(df=3, size=10000)  # numpy's function

t_dist = tf.distributions.StudentT(df=3.0, loc=0.0, scale=1.0)
sess = tf.Session()
b = sess.run(t_dist.sample(10000))

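The documentation for the TensorFlow implementation lists a parameter called scale, described as follows:

The scaling factors of the distribution(s). Note that scale is not technically the standard deviation of this distribution but has semantics more similar to standard deviation than variance.

I have set scale to 1.0, but I have no way of knowing for sure whether the two functions refer to the same distribution.

Can someone help me verify this? Thanks.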

Answer 1

I would say they are, because sampling is defined in almost exactly the same way in both cases. This is how sampling is defined for tf.distributions.StudentT:

def _sample_n(self, n, seed=None):
  # The sampling method comes from the fact that if:
  #   X ~ Normal(0, 1)
  #   Z ~ Chi2(df)
  #   Y = X / sqrt(Z / df)
  # then:
  #   Y ~ StudentT(df).
  seed = seed_stream.SeedStream(seed, "student_t")
  shape = tf.concat([[n], self.batch_shape_tensor()], 0)
  normal_sample = tf.random.normal(shape, dtype=self.dtype, seed=seed())
  df = self.df * tf.ones(self.batch_shape_tensor(), dtype=self.dtype)
  gamma_sample = tf.random.gamma([n],
                                 0.5 * df,
                                 beta=0.5,
                                 dtype=self.dtype,
                                 seed=seed())
  samples = normal_sample * tf.math.rsqrt(gamma_sample / df)
  return samples * self.scale + self.loc  # Abs(scale) not wanted.

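So it is a standard normal sample divided by the square root of (a chi-square sample with parameter df, divided by df). The chi-square sample is drawn as a gamma sample with shape 0.5 * df and rate 0.5, which is equivalent (chi-square is a special case of gamma). The scale value, like loc, only comes into play in the last line, as a way to "relocate" the sample to a given location and scale. When scale is one and loc is zero, they do nothing.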
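To make the role of loc and scale concrete, here is a minimal sketch (NumPy and SciPy only, not TensorFlow's own code) that applies the same affine step as the last line of _sample_n to standard t samples. The parameter values and variable names are illustrative assumptions. It also shows why scale is not the standard deviation: for df > 2 the standard deviation is scale * sqrt(df / (df - 2)).

```python
import numpy as np
from scipy import stats

# Illustrative values, not taken from the question.
df, loc, scale = 3.0, 1.0, 2.0

rng = np.random.default_rng(0)
standard = rng.standard_t(df, size=10_000)

# Same affine transformation as the last line of _sample_n.
shifted = standard * scale + loc

# Reference location-scale t-distribution from SciPy.
dist = stats.t(df, loc=loc, scale=scale)
theoretical_std = dist.std()                     # standard deviation of the distribution
expected_std = scale * np.sqrt(df / (df - 2.0))  # closed form, valid for df > 2

print("theoretical std:", theoretical_std, "scale:", scale)
print("sample median:", np.median(shifted), "loc:", loc)
```

With loc=0.0 and scale=1.0 the affine step leaves the samples unchanged, which is the case asked about in the question.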

And this is the implementation of np.random.standard_t:

double legacy_standard_t(aug_bitgen_t *aug_state, double df) {
  double num, denom;

  num = legacy_gauss(aug_state);
  denom = legacy_standard_gamma(aug_state, df / 2);
  return sqrt(df / 2) * num / sqrt(denom);
}

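This is essentially the same thing, slightly rephrased. Here we also have a gamma with shape df / 2, but it is a standard gamma (rate one). The 0.5 that is missing from the rate now appears as the / 2 in the numerator, inside the sqrt: if G is a standard gamma with shape df / 2, then 2 * G is chi-square with df degrees of freedom, so sqrt(df / 2) * num / sqrt(G) equals num / sqrt(2 * G / df). It is just moving numbers around. There is no scale or loc here, though.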
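The rearrangement can be checked numerically. The following is a minimal NumPy-only sketch (not the code of either library): it feeds the same normal and gamma draws through both formulas and confirms that they produce identical values. Variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
df = 3.0
n = 100_000

x = rng.standard_normal(n)               # X ~ Normal(0, 1)
g = rng.standard_gamma(df / 2, size=n)   # G ~ Gamma(shape=df/2, rate=1)

# TensorFlow-style formula: a gamma with rate 0.5 is a standard gamma
# multiplied by 2, i.e. a chi-square sample with df degrees of freedom.
chi2 = 2.0 * g
t_tf_style = x / np.sqrt(chi2 / df)

# NumPy-style formula from legacy_standard_t.
t_np_style = np.sqrt(df / 2) * x / np.sqrt(g)

print(np.allclose(t_tf_style, t_np_style))
```

Because the two expressions are algebraically identical, the only differences between the libraries' outputs come from their random number generators, not from the distribution being sampled.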

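The real difference is that the TensorFlow distribution is a location-scale generalization of the t-distribution: a standard Student's t multiplied by scale and shifted by loc (this is not the same thing as the noncentral t-distribution). A simple way to check empirically that the two are the same for loc=0.0 and scale=1.0 is to plot histograms of both samples and see how close they look.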

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
np.random.seed(0)
t_np = np.random.standard_t(df=3, size=10000)
with tf.Graph().as_default(), tf.Session() as sess:
    tf.random.set_random_seed(0)
    t_dist = tf.distributions.StudentT(df=3.0, loc=0.0, scale=1.0)
    t_tf = sess.run(t_dist.sample(10000))
plt.hist((t_np, t_tf), np.linspace(-10, 10, 20), label=['NumPy', 'TensorFlow'])
plt.legend()
plt.tight_layout()
plt.show()

Output: (histogram of the NumPy and TensorFlow samples)

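That looks pretty close. Obviously, from a statistical point of view, this is not a proof of any kind. If you are still not sure, there are statistical tools for testing whether a sample comes from a given distribution, or whether two samples come from the same distribution.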
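As one possible way to do that, here is a sketch using SciPy's Kolmogorov-Smirnov tests, assuming SciPy is available. To keep it runnable without TensorFlow, the second sample is built with NumPy using the normal-over-chi-square recipe from the TensorFlow source above; in practice you would substitute the array returned by sess.run(t_dist.sample(10000)).

```python
import numpy as np
from scipy import stats

df = 3.0
n = 10_000
rng = np.random.default_rng(0)

# Sample 1: NumPy's standard t sampler.
t_np = rng.standard_t(df, size=n)

# Sample 2: stand-in for the TensorFlow sample (assumption: built here
# with the same recipe, X / sqrt(Chi2(df) / df), instead of calling TF).
x = rng.standard_normal(n)
chi2 = rng.chisquare(df, size=n)
t_other = x / np.sqrt(chi2 / df)

# One-sample test: does t_np follow Student's t with df degrees of freedom?
one_sample = stats.kstest(t_np, stats.t(df).cdf)

# Two-sample test: do both samples come from the same distribution?
two_sample = stats.ks_2samp(t_np, t_other)

print("one-sample KS:", one_sample.statistic, one_sample.pvalue)
print("two-sample KS:", two_sample.statistic, two_sample.pvalue)
```

A large p-value does not prove the distributions are identical; it only means the test found no evidence of a difference at this sample size.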
