I know that reset_parameters() in conv.py is responsible for the default weight initialization.
I changed that function to:
    def reset_parameters(self):
        n = self.in_channels
        for k in self.kernel_size:
            n *= k
        stdv = 1. / math.sqrt(n)
        print('reset w, stdv=',stdv)
        self.weight.data.uniform_(-stdv, stdv)
        if self.bias is not None:
            print('reset b, stdv=',stdv)
            self.bias.data.uniform_(-stdv, stdv)
        print('w:',self.weight.data.norm(), 'b:',self.bias.data.norm())
After my model is created, I apply the following manual weights_init() to change how the weights of my conv layers are initialized:
    def weights_init(m):
        classname = m.__class__.__name__
        if classname.find('Conv') != -1:
            print(m)
            print(m.weight.data.norm())
            print(m.bias.data.norm())
            std_w = m.weight.size(1) * m.weight.size(2) * m.weight.size(3)
            std_b = m.weight.size(0)
            std_w = 1. / math.sqrt(std_w)
            # std_b = 1. / math.sqrt(std_b)
            std_b = std_w
            m.weight.data.uniform_(-std_w, std_w)
            m.bias.data.uniform_(-std_b, std_b)
            print(m.weight.data.norm())
            print(m.bias.data.norm())
            print('\n\n')
I thought the two were equivalent, but reset_parameters() prints the following:
    reset w, stdv= 0.19245008972987526
    reset b, stdv= 0.19245008972987526
    w: 4.651364750286455 b: 0.9658572124243059
    reset w, stdv= 0.041666666666666664
    reset b, stdv= 0.041666666666666664
    w: 4.60668514079571 b: 0.19021859795685142
    reset w, stdv= 0.041666666666666664
    reset b, stdv= 0.041666666666666664
    w: 6.529658534196003 b: 0.24801288097906313
    reset w, stdv= 0.029462782549439483
    reset b, stdv= 0.029462782549439483
    w: 6.544403663970284 b: 0.20246035190569983
    reset w, stdv= 0.029462782549439483
    reset b, stdv= 0.029462782549439483
    w: 9.237618805061214 b: 0.2699324704165474
    reset w, stdv= 0.020833333333333332
    reset b, stdv= 0.020833333333333332
    w: 9.240560902888104 b: 0.18776950085546212
    reset w, stdv= 0.020833333333333332
    reset b, stdv= 0.020833333333333332
    w: 9.23323252375467 b: 0.19598698034213305
    reset w, stdv= 0.020833333333333332
    reset b, stdv= 0.020833333333333332
    w: 9.247914516750834 b: 0.19991090497324737
    reset w, stdv= 0.020833333333333332
    reset b, stdv= 0.020833333333333332
    w: 13.062441360447233 b: 0.2709608856088436
    reset w, stdv= 0.014731391274719742
    reset b, stdv= 0.014731391274719742
    w: 13.058955297303523 b: 0.19297756771652977
    reset w, stdv= 0.014731391274719742
    reset b, stdv= 0.014731391274719742
    w: 13.064573213326009 b: 0.19342352625500445
    reset w, stdv= 0.014731391274719742
    reset b, stdv= 0.014731391274719742
    w: 13.060771314609305 b: 0.1931597201764238
    reset w, stdv= 0.014731391274719742
    reset b, stdv= 0.014731391274719742
    w: 13.068217941106957 b: 0.1944194648771781
    reset w, stdv= 0.014731391274719742
    reset b, stdv= 0.014731391274719742
    w: 13.064472494773318 b: 0.1871614021517605
    reset w, stdv= 0.014731391274719742
    reset b, stdv= 0.014731391274719742
    w: 13.065174600640301 b: 0.19473828458164352
    reset w, stdv= 0.014731391274719742
    reset b, stdv= 0.014731391274719742
    w: 13.064107007871193 b: 0.18669129860732317
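As a sanity check on the log above, the printed stdv values and norms can be reproduced approximately by hand. This is a minimal sketch using only the standard library; `fan_in_stdv` and `expected_uniform_norm` are hypothetical helper names, not PyTorch functions, and the layer shape is the first Conv2d printed by weights_init().

```python
import math

def fan_in_stdv(in_channels, kernel_size):
    """Bound used by the modified reset_parameters():
    1 / sqrt(in_channels * product of kernel dimensions)."""
    n = in_channels
    for k in kernel_size:
        n *= k
    return 1.0 / math.sqrt(n)

def expected_uniform_norm(stdv, numel):
    """Approximate L2 norm of `numel` samples drawn from U(-stdv, stdv).

    Each sample has E[x^2] = stdv^2 / 3, so the norm is about
    stdv * sqrt(numel / 3). Real draws scatter around this value.
    """
    return stdv * math.sqrt(numel / 3.0)

# First layer in the log: Conv2d(3, 64, kernel_size=(3, 3))
stdv = fan_in_stdv(3, (3, 3))
weight_numel = 64 * 3 * 3 * 3   # out_channels * in_channels * kh * kw
bias_numel = 64                 # one bias per output channel

print(stdv)                                        # close to the logged 0.19245...
print(expected_uniform_norm(stdv, weight_numel))   # roughly 4.62
print(expected_uniform_norm(stdv, bias_numel))     # roughly 0.89
```

The logged values for that layer (w: 4.65, b: 0.97) are in the same range, so reset_parameters() itself appears to do what it was written to do.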
But during my weights_init(), it prints the following. For each layer, the four numbers are the weight norm and bias norm before my re-initialization, then the weight norm and bias norm after it:

    Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    2.5413928031921387
    0.0
    4.599651336669922
    0.8222851753234863

    Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    11.280376434326172
    0.0
    4.624712944030762
    0.18499651551246643

    Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    11.323299407958984
    0.0
    6.527068614959717
    0.2626609206199646

    Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    16.010761260986328
    0.0
    6.516138553619385
    0.18262024223804474

    Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    16.00145149230957
    0.0
    9.22119426727295
    0.2743944823741913

etc.
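One way to characterize the "before" weight norms is to convert each one back into a per-element RMS value and compare it with candidate initializers. The sketch below is plain Python with hypothetical helper names. The second candidate, a normal distribution with std sqrt(2 / (kh * kw * out_channels)), is only a guess that happens to fit these five numbers; nothing in the post confirms that the model applies it.

```python
import math

# (out_channels, in_channels, kh, kw, weight norm printed before re-init)
layers = [
    (64, 3, 3, 3, 2.5413928031921387),
    (64, 64, 3, 3, 11.280376434326172),
    (128, 64, 3, 3, 11.323299407958984),
    (128, 128, 3, 3, 16.010761260986328),
    (256, 128, 3, 3, 16.00145149230957),
]

def implied_rms(norm, numel):
    """Root-mean-square of a tensor's elements, recovered from its L2 norm."""
    return norm / math.sqrt(numel)

def fan_in_uniform_rms(in_c, kh, kw):
    """RMS of U(-s, s) with s = 1 / sqrt(in_c * kh * kw), i.e. s / sqrt(3)."""
    return 1.0 / math.sqrt(3 * in_c * kh * kw)

def fan_out_normal_std(out_c, kh, kw):
    """Assumed alternative: std of N(0, sqrt(2 / (kh * kw * out_c)))."""
    return math.sqrt(2.0 / (kh * kw * out_c))

for out_c, in_c, kh, kw, norm in layers:
    numel = out_c * in_c * kh * kw
    print(
        (out_c, in_c),
        round(implied_rms(norm, numel), 5),       # what the log implies
        round(fan_in_uniform_rms(in_c, kh, kw), 5),  # modified reset_parameters()
        round(fan_out_normal_std(out_c, kh, kw), 5), # guessed alternative
    )
```

For all five layers the implied RMS is far from what the modified reset_parameters() would produce and within a few percent of the guessed fan-out value. Together with the exact 0.0 bias norms, that would be consistent with some other code re-initializing the conv layers after construction, though the numbers alone cannot say where.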
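Clearly the parameters were changed somewhere between the two calls; at the very least, the biases were all set to 0. I don't know whether some other place in the source code modifies the conv weight initialization. I would really appreciate it if anyone could explain this. My English is poor, I hope you can understand it :smile: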