Neural networks and momentum

Question

Should the momentum factor be tied to [the dataset instance and the individual weight] or to [the weight only]? For example:

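def get_momentum(instance, weight):
    # Placeholder body: a real implementation would look up or compute
    # the momentum factor (a float) for this instance/weight pair.
    return 0.0

n = 3                   # arbitrary vector length, for illustration only
instance1 = [0.0] * n   # 1 x n vector
instance2 = [0.0] * n   # 1 x n vector
weights   = [0.0] * n   # 1 x n vector

# Option 1: momentum depends on both the instance and the weight
get_momentum(instance1, weights[0])  # e.g. returns 0.1
get_momentum(instance2, weights[0])  # e.g. returns 0.3 <-- same weight, different momentum

# Option 2: momentum depends on the weight only
get_momentum(instance1, weights[0])  # e.g. returns 0.1
get_momentum(instance2, weights[0])  # e.g. returns 0.1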

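The second option reduces memory complexity. I believe it would also make the learning algorithm more prone to getting stuck in local optima than the first option. Option 1 should give a stronger momentum "pull".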
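To make the two options concrete, here is a minimal sketch. It assumes classical momentum (velocity = momentum * velocity - learning_rate * gradient) on a single linear unit with squared error, trained online. All function names, the learning rate and the momentum coefficient are hypothetical and not taken from the code behind this question. The only difference between the options is where the velocity terms are stored.

```python
def make_velocity_store(n_instances, n_weights, per_instance):
    """Allocate the momentum (velocity) terms for either option."""
    if per_instance:
        # Option 1: one velocity per (instance, weight) pair.
        return [[0.0] * n_weights for _ in range(n_instances)]
    # Option 2: one velocity per weight, shared by all instances.
    return [0.0] * n_weights


def momentum_step(weights, gradient, velocity, learning_rate=0.1, momentum=0.9):
    """Classical momentum update, applied in place to weights and velocity."""
    for j in range(len(weights)):
        velocity[j] = momentum * velocity[j] - learning_rate * gradient[j]
        weights[j] += velocity[j]


def train_epoch(weights, instances, targets, store, per_instance):
    """One online pass over the data for a linear unit with squared error."""
    for i, (x, t) in enumerate(zip(instances, targets)):
        output = sum(w * xi for w, xi in zip(weights, x))
        gradient = [(output - t) * xi for xi in x]
        # Option 1 picks the velocity row of this instance,
        # Option 2 reuses the single shared velocity vector.
        velocity = store[i] if per_instance else store
        momentum_step(weights, gradient, velocity)
```

With Option 1 the velocity of a weight only accumulates across epochs for the same instance. With Option 2 it accumulates across consecutive instances within an epoch as well.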

Answer 1

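Testing

I ran some tests on my hypothesis. The two approaches perform almost identically, with a small but consistent improvement for the first approach.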

Memory complexity of the momentum data structure:

Approach 1: O( instances * weights )
Approach 2: O( weights )
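As a rough illustration of what these bounds mean in practice, the sketch below counts the stored momentum values for each approach. The helper name and the dataset and network sizes are made up for the example. They are not the sizes used in the tests.

```python
def momentum_slots(n_instances, n_weights, per_instance):
    """Number of momentum values that must be kept between updates."""
    if per_instance:
        return n_instances * n_weights  # Approach 1
    return n_weights                    # Approach 2


# Hypothetical sizes, chosen only for illustration.
n_instances, n_weights = 1000, 50
slots_first = momentum_slots(n_instances, n_weights, per_instance=True)
slots_second = momentum_slots(n_instances, n_weights, per_instance=False)
print(slots_first, slots_second)  # 50000 50
```

The first approach grows with the size of the training set, the second only with the size of the network.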

Results:

Each round starts from a predefined set of weights, and both versions are trained on the same weight sets.

$ pypy backprop.py # First approach
Round: 1/10     Required epochs: 40995
Round: 2/10     Required epochs: 40997
Round: 3/10     Required epochs: 40996
Round: 4/10     Required epochs: 40997
Round: 5/10     Required epochs: 40997
Round: 6/10     Required epochs: 40997
Round: 7/10     Required epochs: 40999
Round: 8/10     Required epochs: 40996
Round: 9/10     Required epochs: 40996
Round: 10/10    Required epochs: 40997

$ pypy backprop.py # Second approach
Round: 1/10     Required epochs: 41070
Round: 2/10     Required epochs: 41072
Round: 3/10     Required epochs: 41069
Round: 4/10     Required epochs: 41069
Round: 5/10     Required epochs: 41070
Round: 6/10     Required epochs: 41071
Round: 7/10     Required epochs: 41072
Round: 8/10     Required epochs: 41069
Round: 9/10     Required epochs: 41070
Round: 10/10    Required epochs: 41071

As the results show, the second approach (with the lower memory complexity) needs more training epochs to reach the required accuracy.
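To put a number on the gap, the epoch counts from the two runs above can be averaged. This is plain arithmetic on the logged values, and the variable names are only for illustration.

```python
# Required epochs per round, copied from the two runs above.
first = [40995, 40997, 40996, 40997, 40997, 40997, 40999, 40996, 40996, 40997]
second = [41070, 41072, 41069, 41069, 41070, 41071, 41072, 41069, 41070, 41071]

mean_first = sum(first) / float(len(first))     # 40996.7
mean_second = sum(second) / float(len(second))  # 41070.3

extra_epochs = mean_second - mean_first  # about 73.6
relative = extra_epochs / mean_first     # about 0.0018, i.e. roughly 0.18 %

print("extra epochs: %.1f (%.2f %%)" % (extra_epochs, 100 * relative))
```

On average the second approach needs about 74 more epochs, a difference of roughly 0.18 %.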

Conclusion

The added memory complexity is probably not worth the small training improvement.
