I have been training neural networks for a while, but only recently started using Caffe. For my current task I am using the original code of the guided policy search (GPS) algorithm from here, together with a policy network trained with Caffe. GPS works, or at least roughly learns the expected policy, but the results of the first few iterations look very suspicious and illogical to me.
The short question is whether the following two lines, which appear repeatedly during training, could be a problem:
I0123 09:09:12.799830 6960 net.cpp:676] Ignoring source layer Python1
I0123 09:09:21.119698 6960 net.cpp:676] Ignoring source layer Python4
The longer question is whether the network architecture is correct or whether something is wrong with it. This may be related to the issue mentioned above, and to the fact that the graph of the network does not match its description (see the additional information below).
=== Additional information ===
The created training network has the following description:
state {
  phase: TRAIN
}
layer {
  name: "Python1"
  type: "Python"
  top: "Python1"
  top: "Python2"
  top: "Python3"
  python_param {
    module: "policy_layers"
    layer: "PolicyDataLayer"
    param_str: "{\"shape\": [{\"dim\": [25, 30]}, {\"dim\": [25, 6]}, {\"dim\": [25, 6, 6]}]}"
  }
}
layer {
  name: "InnerProduct1"
  type: "InnerProduct"
  bottom: "Python1"
  top: "InnerProduct1"
  inner_product_param {
    num_output: 42
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "ReLU1"
  type: "ReLU"
  bottom: "InnerProduct1"
  top: "InnerProduct1"
}
layer {
  name: "InnerProduct2"
  type: "InnerProduct"
  bottom: "InnerProduct1"
  top: "InnerProduct2"
  inner_product_param {
    num_output: 42
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "ReLU2"
  type: "ReLU"
  bottom: "InnerProduct2"
  top: "InnerProduct2"
}
layer {
  name: "InnerProduct3"
  type: "InnerProduct"
  bottom: "InnerProduct2"
  top: "InnerProduct3"
  inner_product_param {
    num_output: 6
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "Python4"
  type: "Python"
  bottom: "InnerProduct3"
  bottom: "Python2"
  bottom: "Python3"
  top: "Python4"
  loss_weight: 1
  python_param {
    module: "policy_layers"
    layer: "WeightedEuclideanLoss"
  }
}
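For reference, the param_str on the Python1 layer is just a JSON string that presumably tells the data layer how to reshape its three tops: observations (25 x 30), target actions (25 x 6) and action precisions (25 x 6 x 6). Below is a minimal, purely illustrative sketch of how a Caffe Python data layer typically parses such a string; it is not the actual PolicyDataLayer from policy_layers.py.

import json
import caffe

class PolicyDataLayer(caffe.Layer):
    # Illustrative sketch only: reshape the three tops from the JSON in param_str.
    def setup(self, bottom, top):
        shapes = json.loads(self.param_str)["shape"]
        # e.g. [{"dim": [25, 30]}, {"dim": [25, 6]}, {"dim": [25, 6, 6]}]
        for blob, s in zip(top, shapes):
            blob.reshape(*s["dim"])

    def reshape(self, bottom, top):
        pass  # shapes are fixed in setup

    def forward(self, bottom, top):
        pass  # assumption: the data blobs are filled externally (by the GPS code) before each solver step

    def backward(self, top, propagate_down, bottom):
        pass  # a data layer has nothing to backpropagate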
I visualized the network with Netscope; the image is available here. Note that the blobs Python2 and Python3 are not shown there, as if they were not used.
However, the debug output that Caffe prints while creating the network seems to use all the layers:
I0123 09:08:17.156579 6960 layer_factory.hpp:77] Creating layer Python1
I0123 09:08:17.157122 6960 net.cpp:84] Creating Layer Python1
I0123 09:08:17.157131 6960 net.cpp:380] Python1 -> Python1
I0123 09:08:17.157140 6960 net.cpp:380] Python1 -> Python2
I0123 09:08:17.157145 6960 net.cpp:380] Python1 -> Python3
I0123 09:08:17.157940 6960 net.cpp:122] Setting up Python1
I0123 09:08:17.157953 6960 net.cpp:129] Top shape: 25 30 (750)
I0123 09:08:17.157956 6960 net.cpp:129] Top shape: 25 6 (150)
I0123 09:08:17.157959 6960 net.cpp:129] Top shape: 25 6 6 (900)
I0123 09:08:17.157961 6960 net.cpp:137] Memory required for data: 7200
I0123 09:08:17.157964 6960 layer_factory.hpp:77] Creating layer InnerProduct1
I0123 09:08:17.157971 6960 net.cpp:84] Creating Layer InnerProduct1
I0123 09:08:17.157974 6960 net.cpp:406] InnerProduct1 <- Python1
I0123 09:08:17.157979 6960 net.cpp:380] InnerProduct1 -> InnerProduct1
I0123 09:08:17.158669 6960 net.cpp:122] Setting up InnerProduct1
I0123 09:08:17.158679 6960 net.cpp:129] Top shape: 25 42 (1050)
I0123 09:08:17.158682 6960 net.cpp:137] Memory required for data: 11400
I0123 09:08:17.158690 6960 layer_factory.hpp:77] Creating layer ReLU1
I0123 09:08:17.158696 6960 net.cpp:84] Creating Layer ReLU1
I0123 09:08:17.158699 6960 net.cpp:406] ReLU1 <- InnerProduct1
I0123 09:08:17.158704 6960 net.cpp:367] ReLU1 -> InnerProduct1 (in-place)
I0123 09:08:17.158708 6960 net.cpp:122] Setting up ReLU1
I0123 09:08:17.158712 6960 net.cpp:129] Top shape: 25 42 (1050)
I0123 09:08:17.158715 6960 net.cpp:137] Memory required for data: 15600
I0123 09:08:17.158718 6960 layer_factory.hpp:77] Creating layer InnerProduct2
I0123 09:08:17.158722 6960 net.cpp:84] Creating Layer InnerProduct2
I0123 09:08:17.158725 6960 net.cpp:406] InnerProduct2 <- InnerProduct1
I0123 09:08:17.158730 6960 net.cpp:380] InnerProduct2 -> InnerProduct2
I0123 09:08:17.158814 6960 net.cpp:122] Setting up InnerProduct2
I0123 09:08:17.158820 6960 net.cpp:129] Top shape: 25 42 (1050)
I0123 09:08:17.158823 6960 net.cpp:137] Memory required for data: 19800
I0123 09:08:17.158828 6960 layer_factory.hpp:77] Creating layer ReLU2
I0123 09:08:17.158833 6960 net.cpp:84] Creating Layer ReLU2
I0123 09:08:17.158836 6960 net.cpp:406] ReLU2 <- InnerProduct2
I0123 09:08:17.158840 6960 net.cpp:367] ReLU2 -> InnerProduct2 (in-place)
I0123 09:08:17.158843 6960 net.cpp:122] Setting up ReLU2
I0123 09:08:17.158848 6960 net.cpp:129] Top shape: 25 42 (1050)
I0123 09:08:17.158849 6960 net.cpp:137] Memory required for data: 24000
I0123 09:08:17.158851 6960 layer_factory.hpp:77] Creating layer InnerProduct3
I0123 09:08:17.158855 6960 net.cpp:84] Creating Layer InnerProduct3
I0123 09:08:17.158859 6960 net.cpp:406] InnerProduct3 <- InnerProduct2
I0123 09:08:17.158864 6960 net.cpp:380] InnerProduct3 -> InnerProduct3
I0123 09:08:17.158922 6960 net.cpp:122] Setting up InnerProduct3
I0123 09:08:17.158927 6960 net.cpp:129] Top shape: 25 6 (150)
I0123 09:08:17.158929 6960 net.cpp:137] Memory required for data: 24600
I0123 09:08:17.158934 6960 layer_factory.hpp:77] Creating layer Python4
I0123 09:08:17.158954 6960 net.cpp:84] Creating Layer Python4
I0123 09:08:17.158958 6960 net.cpp:406] Python4 <- InnerProduct3
I0123 09:08:17.158962 6960 net.cpp:406] Python4 <- Python2
I0123 09:08:17.158964 6960 net.cpp:406] Python4 <- Python3
I0123 09:08:17.158968 6960 net.cpp:380] Python4 -> Python4
I0123 09:08:17.158999 6960 net.cpp:122] Setting up Python4
I0123 09:08:17.159006 6960 net.cpp:129] Top shape: 1 (1)
I0123 09:08:17.159009 6960 net.cpp:132] with loss weight 1
I0123 09:08:17.159016 6960 net.cpp:137] Memory required for data: 24604
I0123 09:08:17.159020 6960 net.cpp:198] Python4 needs backward computation.
I0123 09:08:17.159024 6960 net.cpp:198] InnerProduct3 needs backward computation.
I0123 09:08:17.159027 6960 net.cpp:198] ReLU2 needs backward computation.
I0123 09:08:17.159030 6960 net.cpp:198] InnerProduct2 needs backward computation.
I0123 09:08:17.159034 6960 net.cpp:198] ReLU1 needs backward computation.
I0123 09:08:17.159036 6960 net.cpp:198] InnerProduct1 needs backward computation.
I0123 09:08:17.159040 6960 net.cpp:200] Python1 does not need backward computation.
I0123 09:08:17.159044 6960 net.cpp:242] This network produces output Python4
I0123 09:08:17.159049 6960 net.cpp:255] Network initialization done.
Finally, the policy_layers.py module is defined here.
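For context, a precision-weighted Euclidean loss of this kind is usually implemented roughly as sketched below. This is only an assumption about the interface (bottom[0] = predicted actions from InnerProduct3, bottom[1] = target actions from Python2, bottom[2] = per-sample precision matrices from Python3), not the actual code from policy_layers.py.

import numpy as np
import caffe

class WeightedEuclideanLoss(caffe.Layer):
    # Illustrative sketch of a precision-weighted Euclidean loss:
    #   L = 1/(2N) * sum_i (a_i - t_i)^T W_i (a_i - t_i)
    # bottom[0]: predicted actions   (N x 6)
    # bottom[1]: target actions      (N x 6)
    # bottom[2]: precision matrices  (N x 6 x 6)
    def setup(self, bottom, top):
        if len(bottom) != 3:
            raise Exception("Need predictions, targets and precisions.")

    def reshape(self, bottom, top):
        self.wdiff = np.zeros_like(bottom[0].data)
        top[0].reshape(1)  # scalar loss, matches "Top shape: 1 (1)" in the log

    def forward(self, bottom, top):
        n = bottom[0].data.shape[0]
        d = bottom[0].data - bottom[1].data                       # a_i - t_i
        self.wdiff = np.einsum('nij,nj->ni', bottom[2].data, d)   # W_i (a_i - t_i)
        top[0].data[0] = 0.5 * np.sum(d * self.wdiff) / n

    def backward(self, top, propagate_down, bottom):
        # Assumes symmetric precision matrices, so dL/da_i = W_i (a_i - t_i) / N.
        if propagate_down[0]:
            n = bottom[0].data.shape[0]
            bottom[0].diff[...] = top[0].diff[0] * self.wdiff / n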