Question

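I just wrote my first attempt at a neural network that classifies households using energy consumption features. I can get it to run, but the output looks questionable. As you can see, I use 18 features (maybe too many?) to predict whether a household is a single or a non-single household.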

I have 3488 rows like this:

c_day     c_weekend c_evening c_morning c_night c_noon c_max c_min r_mean_max r_min_mean r_night_day r_morning_noon
 12        14      1826         9     765      3   447     2        878          0        7338              4
r_evening_noon t_above_1kw t_above_2kw t_above_mean t_daily_max single
 3424           1           695         0       174319075712881     1

My neural network uses these parameters:

net.nn <- neuralnet(single
            ~ c_day
            + c_weekend
            + c_weekday 
            + c_evening
            + c_morning
            + c_night
            + c_noon
            + c_max
            + c_min
            + r_mean_max
            + r_min_mean
            + r_night_day
            + r_morning_noon
            + r_evening_noon
            + t_above_1kw
            + t_above_2kw
            + t_above_mean
            + t_daily_max
            ,train, hidden=15, threshold=0.01,linear.output=F)

1 repetition was calculated.

        Error Reached Threshold Steps
1 126.3425379    0.009899229932  4091

Beforehand, I normalized the data using the min-max normalization formula:

for(i in names(full_data)){
  x <- as.numeric(full_data[,i])
  full_data[,i] <- (x-min(x)/max(x)-min(x))
}

I have 3488 rows of data and split them into a training set and a test set.

half <- nrow(full_data)/2 
train <- full_data[1:half,]
test <- full_data[half:3488,]

net.results <- compute(net.nn,test)
nn$net.result

I took the predictions and bound them to the actual "single [yes/no]" column to compare the results:

predict <- nn$net.result
cleanoutput <- cbind(predict,full_data$single[half:3488])
colnames(cleanoutput) <- c("predicted","actual")

When I print it, these are the classification results for the first 10 rows:

            predicted actual
1701 0.1661093405      0
1702 0.1317067578      0
1703 0.1677147708      1
1704 0.2051188618      1
1705 0.2013035634      0
1706 0.2088726723      0
1707 0.2683753128      1
1708 0.1661093405      0
1709 0.2385537285      1
1710 0.1257108821      0

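So, if I understand correctly, when I round the predicted output it should be 0 or 1, but it always ends up as 0!

Am I using the wrong parameters? Is my data unsuitable for neural network prediction? Is the normalization wrong?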

Answer 1

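Using scale(full_data) on the entire data set did the trick. The data is now standardized by mean and standard deviation, and the output looks much more reliable.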
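One possible reason the switch helped: in the min-max line from the question, `(x-min(x)/max(x)-min(x))`, R evaluates the division before the subtractions, so the expression is `x - (min(x)/max(x)) - min(x)` and the column is shifted rather than rescaled. The sketch below compares that expression with a parenthesized min-max formula and with scale(). The vector `x` is a made-up toy column, not the asker's data.

```r
# Hypothetical toy column standing in for one feature of full_data.
x <- c(2, 447, 765, 1826)

# Expression as written in the question: the division binds tighter than
# the subtractions, so this is x - (min(x) / max(x)) - min(x).
as_written <- x - min(x) / max(x) - min(x)

# Min-max normalization with explicit parentheses maps x onto [0, 1].
min_max <- (x - min(x)) / (max(x) - min(x))

# Standardization as suggested in this answer: mean 0, standard deviation 1.
# scale() returns a matrix, so as.numeric() flattens it back to a vector.
standardized <- as.numeric(scale(x))

print(range(as_written))   # still spans roughly the original range
print(range(min_max))      # spans 0 to 1
print(round(c(mean(standardized), sd(standardized)), 10))
```

Either the parenthesized min-max version or scale() puts all columns on a comparable scale; which one works better for this data set is something to check empirically.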

Answer 2

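This means your model's performance is still poor. Once tuning gets the model to a good level of performance, you should see the expected behavior. Neural networks are very sensitive to scale differences between columns, so standardizing the data [mean = 0, std = 1] is good practice. As the OP pointed out, scale() does exactly that.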
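To put a number on "poor performance", one option is to round the predictions and tabulate them against the actual labels. The sketch below uses only the ten rows printed in the question, so it illustrates the check rather than evaluating the full test set. None of those predictions exceeds 0.27, which is why rounding (a 0.5 threshold) yields 0 every time.

```r
# First ten test rows as printed in the question.
predicted <- c(0.1661093405, 0.1317067578, 0.1677147708, 0.2051188618,
               0.2013035634, 0.2088726723, 0.2683753128, 0.1661093405,
               0.2385537285, 0.1257108821)
actual <- c(0, 0, 1, 1, 0, 0, 1, 0, 1, 0)

# Rounding is equivalent to thresholding at 0.5.
predicted_class <- round(predicted)

# Confusion table: rows are predicted classes, columns are actual classes.
# Fixed factor levels keep both classes visible even if one is never predicted.
confusion <- table(predicted = factor(predicted_class, levels = c(0, 1)),
                   actual = factor(actual, levels = c(0, 1)))
print(confusion)

# Share of rows where the rounded prediction matches the label.
accuracy <- mean(predicted_class == actual)
print(accuracy)
```

On these ten rows the model never predicts class 1, so the accuracy simply equals the share of 0 labels. Running the same check again after standardizing the inputs would show whether the predictions start to separate the two classes.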
