Problem with "Regression with Probabilistic Layers in TensorFlow Probability"

Asked: 2020-02-26 14:01:04

Tags: python tensorflow tensorflow-probability

I'm running into trouble with tfp.layers.DistributionLambda. I'm new to TF and struggling to make the tensors flow. Can someone offer some insight into how the parameters of the output distribution are set up?

Context:

The TFP team wrote a tutorial, Regression with Probabilistic Layers in TensorFlow Probability, which builds the following model:

# Build model (imports added so the snippet runs standalone).
import tensorflow as tf
import tensorflow_probability as tfp

tfk = tf.keras
tfd = tfp.distributions

model = tfk.Sequential([
  tf.keras.layers.Dense(1 + 1),
  tfp.layers.DistributionLambda(
      lambda t: tfd.Normal(loc=t[..., :1],
                           scale=1e-3 + tf.math.softplus(0.05 * t[..., 1:]))),
])
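(For context, the tutorial fits this model by minimizing the negative log-likelihood of the observed targets. A minimal sketch of that training loop follows; x, y, and x_test here are placeholders for the regression data, not anything defined in this snippet:)

# The model outputs a distribution object, so the loss can score the
# observed y directly under the predicted distribution.
negloglik = lambda y, rv_y: -rv_y.log_prob(y)

model.compile(optimizer=tf.optimizers.Adam(learning_rate=0.01), loss=negloglik)
model.fit(x, y, epochs=1000, verbose=False)

yhat = model(x_test)   # a tfd.Normal instance, not a plain tensor
print(yhat.mean(), yhat.stddev())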

My question:

It uses tfp.layers.DistributionLambda to output a normal distribution, but it's not clear to me how the parameters of tfd.Normal (the mean/loc and standard deviation/scale) are set up, so I haven't been able to swap the Normal for a Gamma distribution. I tried the following without success (the predicted distribution's parameters come out as nan).

def dist_output_layer(t, softplus_scale=0.05):
    """Create a Gamma distribution with learnable mean and variance."""
    mean = t[..., :1]
    std_dev = 1e-3 + tf.math.softplus(softplus_scale * mean)

    # Moment matching: for a Gamma, mean = alpha/beta and variance = alpha/beta**2,
    # so alpha = (mean/std_dev)**2 and beta = alpha/mean.
    alpha = (mean / std_dev)**2
    beta = alpha / mean

    return tfd.Gamma(concentration=alpha,
                     rate=beta)

# Build model.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(20, activation="relu"),  # "By using a deeper neural network and introducing nonlinear activation functions, however, we can learn more complicated functional dependencies!"
    tf.keras.layers.Dense(1 + 1),  # two neurons here because the output distribution has a mean and a std. deviation
    tfp.layers.DistributionLambda(dist_output_layer)
])
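(One observation about the snippet above, ahead of the answers: the final Dense layer has no activation, so `mean` can go negative, which puts the derived Gamma parameters outside their valid range. A quick check, assuming the same tfd alias as before:)

# A negative `mean` makes both concentration and rate negative, which is
# outside the Gamma distribution's parameter space, so log_prob (and hence
# the negative log-likelihood loss) evaluates to nan:
bad = tfd.Gamma(concentration=-2.0, rate=-4.0)
print(bad.log_prob(1.0))  # nan (log of a negative rate)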

Many thanks.

2 answers:

Answer 0 (score: 2)

To be honest, there's a lot that could be said about the code snippet you pasted from Medium.

Still, I hope you'll find my comments below somewhat useful.

# Build model.
model = tfk.Sequential([

    # The first layer is a Dense layer with 2 units, one for each of the parameters that will
    # be learnt (see next layer). Its implied output shape is (batch_size, 2).
    # Note that this Dense layer has no activation function, because we want any real value:
    # its outputs parameterize the Normal distribution in the following layer.
    tf.keras.layers.Dense(1 + 1),

    # The next layer is a DistributionLambda that encapsulates a Normal distribution. The
    # DistributionLambda takes a function in its constructor, and this function should take the
    # output tensor of the previous layer as its input (that is the Dense layer commented above).
    # The goal is to learn the 2 parameters of the distribution: loc (the mean) and scale (the
    # standard deviation). For this, a lambda construct is used. The ellipsis you can see in the
    # loc and scale arguments (the 3 dots) stands for the batch dimensions. Also note that scale
    # cannot be negative: the softplus function is used to make sure the learnt scale parameter
    # never goes negative.
    tfp.layers.DistributionLambda(
      lambda t: tfd.Normal(loc=t[..., :1],
                           scale=1e-3 + tf.math.softplus(0.05 * t[..., 1:]))),
])
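(To connect this back to the actual question: the same constraint logic carries over to a Gamma, whose concentration and rate must both be positive. A minimal sketch of that variant, my own adaptation rather than anything from the tutorial:)

model = tfk.Sequential([
    tf.keras.layers.Dense(1 + 1),
    tfp.layers.DistributionLambda(
        # Apply the same shift-and-softplus constraint to both Gamma
        # parameters, since concentration and rate must be strictly positive.
        lambda t: tfd.Gamma(
            concentration=1e-3 + tf.math.softplus(0.05 * t[..., :1]),
            rate=1e-3 + tf.math.softplus(0.05 * t[..., 1:]))),
])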

Answer 1 (score: 1)

Regarding the question about adding the .05: it's a small fudge to work around some gradient problems that can show up without it. Basically it says, a priori, that we're confident the actual variability is no smaller than epsilon (0.05 here), so by adding it we make sure the std dev can never get smaller than that.

https://github.com/tensorflow/probability/issues/751

Money quote:

"If infinitesimal scales end up being a practical problem for a given task, the workaround we typically use is shift-and-softplus, e.g. scale = epsilon + tf.math.softplus(unconstrained_scale), where epsilon is some tiny value like 1e-5 that we're a priori confident is much smaller than the true scale."

Edit: what's actually added is 1e-3, for the reason I described above. As for the multiplication... probably again just scaling or a gradient adjustment, or to let the scale parameter start out at a certain size.
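(Putting the quoted pattern in one place; the function name and default values here are illustrative, not from TFP itself:)

import tensorflow as tf

def constrained_scale(unconstrained, epsilon=1e-3, slope=0.05):
    # softplus maps any real number to a positive one; the slope rescales
    # the raw output (affecting gradients and the initial size), and
    # epsilon keeps the result bounded away from zero.
    return epsilon + tf.math.softplus(slope * unconstrained)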