Question

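I am building a neural network that takes 480 data points as input and outputs 18 data points. The inputs are magnetic field strengths and the outputs are the coordinates of detected objects (or zero if no object is detected), so none of the data points are truly categorical. For some reason, when I train the model, I get the same output for every input I try, for example: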

The code I used to generate this model is:

>>> output2 = loaded_model.predict(X_)
>>> output2[0]
array([0.32035217, 0.3027814 , 0.2977892 , 0.30922157, 0.3294088 ,
       0.40853357, 0.09848618, 0.15266985, 0.29188123, 0.31177315,
       0.4652696 , 0.6406114 , 0.204305  , 0.23156416, 0.19870688,
       0.21269864, 0.28510743, 0.29115945], dtype=float32)
>>> output2[100]
array([0.32035217, 0.3027814 , 0.2977892 , 0.30922157, 0.3294088 ,
       0.40853357, 0.09848618, 0.15266985, 0.29188123, 0.31177315,
       0.4652696 , 0.6406114 , 0.204305  , 0.23156416, 0.19870688,
       0.21269864, 0.28510743, 0.29115945], dtype=float32)
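Comparing two rows by eye only samples the problem. A minimal sketch of checking every row against the first, in plain Python, is shown here. The helper name `rows_identical` is hypothetical, the toy values are made up, and it assumes the predictions have been converted to nested lists, for example with `output2.tolist()`.

```python
import math

def rows_identical(preds, tol=1e-6):
    """Return True if every row of `preds` matches the first row within `tol`."""
    first = preds[0]
    return all(
        math.isclose(a, b, abs_tol=tol)
        for row in preds[1:]
        for a, b in zip(first, row)
    )

# Toy data standing in for output2.tolist(); the values are invented.
constant = [[0.32, 0.30, 0.29], [0.32, 0.30, 0.29], [0.32, 0.30, 0.29]]
varied = [[0.32, 0.30, 0.29], [0.10, 0.55, 0.29]]

print(rows_identical(constant))  # True
print(rows_identical(varied))    # False
```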

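I have read that possible causes include a learning rate that is too high, layers that are not trainable, and a batch size that is too small. I tried lowering the learning rate to 0.0001 and still get the same result, and as far as I know all of my layers are trainable. The last possible cause, which I have not tried yet, is the batch size. I have a few thousand training samples, so maybe that is the problem; my next round of training will increase the batch size from 32 to 400. Or is the problem somewhere else that I cannot see?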
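To put the batch-size change in perspective: the number of weight updates per epoch is the sample count divided by the batch size, rounded up. The sketch below assumes 3,000 training samples, a made-up figure standing in for "a few thousand"; the helper name `updates_per_epoch` is hypothetical.

```python
import math

def updates_per_epoch(n_samples, batch_size):
    """Number of gradient updates in one pass over the training data."""
    return math.ceil(n_samples / batch_size)

n_samples = 3000  # assumed; the exact sample count is not given
for batch_size in (32, 400):
    print(batch_size, updates_per_epoch(n_samples, batch_size))
# 32 94
# 400 8
```

Under that assumption, the larger batch gives far fewer updates per epoch, so it may take more epochs to make the same progress.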

I have also read about using callbacks = ['early_stopping_monitor']. Would that be appropriate in this case?
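For context, Keras expects `callbacks` to be a list of callback objects (for example a `keras.callbacks.EarlyStopping` instance), not strings. The plain-Python sketch below only illustrates the idea behind early stopping with a patience counter. The function name and the loss values are invented for illustration, and this is not the Keras implementation.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the 0-based epoch at which training would stop, or None."""
    best = float("inf")
    waited = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return None

# Invented validation losses: improvement stalls after epoch 2.
losses = [0.90, 0.70, 0.65, 0.66, 0.67, 0.64]
print(early_stopping_epoch(losses, patience=2))  # 4
```

Early stopping decides when training ends; it does not change what the model learns before that point.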

Edit: Also, could the kernel_regularizer = regularizers.l2(0.01) term be affecting this?
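For reference, an L2 kernel regularizer adds a penalty to the loss equal to the factor (0.01 here) times the sum of the squared kernel weights, which pushes weights toward zero. A plain-Python sketch of that term, with made-up weights and a hypothetical helper name:

```python
def l2_penalty(weights, l2=0.01):
    """L2 regularization term: l2 * sum of squared weights."""
    return l2 * sum(w * w for w in weights)

weights = [0.5, -1.0, 2.0]  # invented values
print(l2_penalty(weights))  # roughly 0.0525
```

With very wide layers the sum runs over many weights, so the penalty can become large relative to the data loss. Whether that is what happens in this model cannot be told from the post alone.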

Answer 1

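You are overfitting because of the very large number of neurons in your Dense layers. Reduce them to a reasonable range such as [512, 1024], start at the lower end, and move up as needed. Reduce the number of layers as well, and add more only when necessary.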
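To get a feel for how layer width drives model size, the sketch below counts the parameters of a fully connected network with 480 inputs and 18 outputs, as in the question. The single-hidden-layer layout and the helper names are assumptions for illustration, not the asker's architecture.

```python
def dense_params(n_in, n_out):
    """Parameters of one Dense layer: a weight matrix plus a bias vector."""
    return n_in * n_out + n_out

def mlp_params(layer_sizes):
    """Total parameters of a stack of Dense layers with the given sizes."""
    return sum(dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:]))

for width in (512, 1024):
    print(width, mlp_params([480, width, 18]))
# 512 255506
# 1024 510994
```

With only a few thousand training samples, even the smaller of these has far more parameters than examples, which is why starting at the low end is a sensible default.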

Answer 2

Are you sure you are loading the weights along with the model? Try

model.load_weights(path)

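If you do not load the weights obtained from training, the model will use random weights and give the same prediction every time, because it has not learned anything.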
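One way to check this is to compare the weights of the trained model with those of the loaded model. The sketch below is plain Python and the helper name `weights_match` is hypothetical. With Keras, one could pass `[w.tolist() for w in model.get_weights()]` for each model. The toy values are invented.

```python
import math

def _flatten(x):
    """Yield the numbers in an arbitrarily nested list (shape is ignored)."""
    if isinstance(x, (list, tuple)):
        for item in x:
            yield from _flatten(item)
    else:
        yield x

def weights_match(a, b, tol=1e-6):
    """Return True if two nested weight lists hold the same values within `tol`."""
    flat_a, flat_b = list(_flatten(a)), list(_flatten(b))
    return len(flat_a) == len(flat_b) and all(
        math.isclose(x, y, abs_tol=tol) for x, y in zip(flat_a, flat_b)
    )

# Toy [kernel, bias] pairs standing in for real model weights.
trained = [[[0.1, -0.2], [0.3, 0.4]], [0.0, 0.5]]
reloaded = [[[0.1, -0.2], [0.3, 0.4]], [0.0, 0.5]]
fresh = [[[0.7, 0.1], [-0.3, 0.2]], [0.0, 0.0]]

print(weights_match(trained, reloaded))  # True
print(weights_match(trained, fresh))     # False
```

If the comparison fails for the loaded model, the trained weights were probably not restored.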

Keras model gives the same output for all inputs
