Error when using mx.mlp from the mxnet package in R

Asked: 2018-01-28 18:42:19

Tags: r neural-network mxnet

I installed the mxnet package in R version 3.4.3 using:

    cran <- getOption("repos")
    cran["dmlc"] <- "https://s3-us-west-2.amazonaws.com/apache-mxnet/R/CRAN/"
    options(repos = cran)
    install.packages("mxnet")

I run into a problem when fitting a neural network.

In the code below I use the Breast Cancer Wisconsin dataset, available at https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Original%29

BC_data <- read.csv("Data_breastcancer.csv", sep = ";")

# generate a train and test set
trainIndex = sample(1:nrow(BC_data), size = round(0.8*nrow(BC_data)), replace=FALSE)
train_data <- BC_data[trainIndex,]
test_data <- BC_data[-trainIndex,]
X_train <- train_data[,c(-1,-11)]
y_train <- train_data[,11]   

# estimate neural network
model = mx.mlp(as.matrix(X_train), as.numeric(y_train), hidden_node = 10, out_node = 2, out_activation = "softmax", learning.rate = 0.1, num.round = 20)

However, instead of the accuracy being reported at each iteration, the only output I get is:

Start training with 1 devices
Warning message:
In mx.model.select.layout.train(X, y) :
Auto detect layout of input matrix, use rowmajor..

So it seems that the iterative training process does not start at all.

Does anyone know how to fix this?

1 Answer:

Answer 0: (score: 1)

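You need to make a few adjustments:

  1. MXNet doesn't work with arbitrary class names. In your case the class labels are "2" and "4". You need to convert them to consecutive integers starting from 0 (see the example in the code below).
  2. You need to pass accuracy as the metric to mx.mlp. There is an error in the documentation: it says the parameter name is "eval_metric", but it is actually "eval.metric", as in the mx.model.FeedForward.create function.
  3. The original dataset uses "?" for NA values. mx.mlp doesn't work with NA, so you need to replace them with something. I chose 0, but you can use something more domain-specific if needed.
  4. Here is code that works with the breast-cancer-wisconsin.data file, which I found via your link: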

    library(mxnet)
    BC_data <- read.csv(
      file = "breast-cancer-wisconsin.data", 
      sep = ",", 
      header = FALSE, 
      colClasses = c(rep("numeric", 11)), 
      na.strings = c("", "?") # A few records in the 7th column contain "?"; treat "?" as NA
    )
    
    BC_data[is.na(BC_data)] <- 0 # Replace NA with zeroes
    
    # generate a train and test set
    trainIndex = sample(1:nrow(BC_data), size = round(0.8*nrow(BC_data)), replace=FALSE)
    train_data <- BC_data[trainIndex,]
    test_data <- BC_data[-trainIndex,]
    X_train <- train_data[,c(-1,-11)]
    y_train <- train_data[,11]   
    
    # estimate neural network
    model = mx.mlp(
        data = as.matrix(X_train), 
        label = as.numeric(ifelse(y_train == 2, 0, 1)), # Replace classes with 0 and 1
        hidden_node = 10, 
        out_node = 2, 
        out_activation = "softmax", 
        learning.rate = 0.1, 
        num.round = 20,
        array.layout = "rowmajor", # get rid of a nasty warning
        eval.metric=mx.metric.accuracy # set Accuracy as a metric
      )
    

    If I run this code, I get the following output:

    Start training with 1 devices
    [1] Train-accuracy=0.64453125
    [2] Train-accuracy=0.6515625
    [3] Train-accuracy=0.65625
    [4] Train-accuracy=0.9
    [5] Train-accuracy=0.95625
    [6] Train-accuracy=0.95
    [7] Train-accuracy=0.94375
    [8] Train-accuracy=0.9328125
    [9] Train-accuracy=0.93125
    [10] Train-accuracy=0.9328125
    [11] Train-accuracy=0.9375
    [12] Train-accuracy=0.9390625
    [13] Train-accuracy=0.9484375
    [14] Train-accuracy=0.95
    [15] Train-accuracy=0.9453125
    [16] Train-accuracy=0.946875
    [17] Train-accuracy=0.9484375
    [18] Train-accuracy=0.95
    [19] Train-accuracy=0.9484375
    [20] Train-accuracy=0.9515625
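
Points 1 and 3 can also be handled in a more general way than the hard-coded `ifelse` and the zero fill. The following is only a minimal sketch in base R, not part of mxnet: `encode_labels` and `impute_median` are hypothetical helper names that I made up for illustration, and median imputation is just one possible choice of replacement value. It runs on toy data rather than the real file.

```r
# Hypothetical helper: map arbitrary class labels to 0, 1, ..., K-1.
encode_labels <- function(y) {
  f <- factor(y)                       # levels are the sorted unique values
  list(
    label   = as.integer(f) - 1L,      # zero-based integer codes
    classes = levels(f)                # original names, kept for decoding
  )
}

# Hypothetical helper: replace NA in each numeric column
# with the median of that column's observed values.
impute_median <- function(df) {
  for (col in names(df)) {
    missing <- is.na(df[[col]])
    if (is.numeric(df[[col]]) && any(missing)) {
      df[[col]][missing] <- median(df[[col]], na.rm = TRUE)
    }
  }
  df
}

# Toy data standing in for the real dataset (assumption, for illustration only)
toy <- data.frame(x1 = c(1, NA, 3, 5), class = c(2, 4, 4, 2))

toy_clean <- impute_median(toy)
enc <- encode_labels(toy_clean$class)

print(toy_clean$x1)  # 1 3 3 5
print(enc$label)     # 0 1 1 0
print(enc$classes)   # "2" "4"
```

With the real data, `enc$label` would take the place of the `ifelse(...)` expression passed as `label`, and `enc$classes` lets you translate predicted indices back to the original "2" and "4".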