Question

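I am applying the k-nearest neighbours (k-NN) function to a data set for different values of k. I want to make the document as dynamic as possible, so that it can easily be reused with other data and other values of k.

I apply the knn function inside a for loop over the different values of k and store each result in a variable named after k (shown in the code below).

The problem comes when I want to repeat that pattern: use the output of the knn function to retrieve the probabilities and, while running the for loop, give each result a name that depends on k.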

Data:

data_train and data_test are data frames with the following structure:

  v1 v2 v3 v4 v5
1  1  0  1  0  1
2  0  1  1  0  1
3  1  1  1  1  1
4  0  1  1  0  0
5  1  0  1  0  1

train_lab is a factor: "+" "+" "-" "-" "-"

This is the code that works for applying the knn() function:

# Using the knn() function from the class package
library(class)

# The values in ks must have been defined at the top of the document
ks <- c(3, 5, 7)

test_pred_names <- c()

for (k in ks) {

  # Build a variable name that depends on k
  test_pred <- paste("test_pred_", k, sep = "")

  # Including the argument prob = TRUE outputs the probabilities of the
  # predicted class together with the predicted class for each test case

  # Assign the output of knn to the variable named in test_pred
  print(assign(test_pred, knn(train = data_train, test = data_test, cl = train_lab, k = k, prob = TRUE)))

  # Store the name of the variable holding each knn output
  test_pred_names <- c(test_pred_names, test_pred)
}

# Collect the knn outputs in a list whose names are those in test_pred_names
test_preds <- mget(test_pred_names, envir = globalenv())
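One possible alternative to the assign()/mget() pattern is to build the named list directly, so that no variables have to be created by name. The sketch below is only an illustration: the toy data is made up (it is not the data from the question), and it assumes the class package is installed.

```r
library(class)  # provides knn()

# Made-up toy data for illustration only (not the data from the question)
set.seed(42)
data_train <- as.data.frame(matrix(rbinom(100, 1, 0.5), ncol = 5))
data_test  <- as.data.frame(matrix(rbinom(25, 1, 0.5), ncol = 5))
train_lab  <- factor(rep(c("+", "-"), times = 10))

ks <- c(3, 5, 7)

# One knn() result per value of k, collected directly in a list
test_preds <- lapply(ks, function(k) {
  knn(train = data_train, test = data_test, cl = train_lab, k = k, prob = TRUE)
})

# Name each element after the k that produced it
names(test_preds) <- paste0("test_pred_", ks)

names(test_preds)
# "test_pred_3" "test_pred_5" "test_pred_7"
```

The resulting test_preds has the same shape as the list returned by mget(), so the rest of the workflow does not need to change.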

Then I use the results in test_preds to obtain vectors containing only part of the knn output, namely the probabilities:

test_prob_names = c()

# Extracting the probabilities for each of the outputs of knn
for (test_pred in test_preds){

      # assign different variable names according to k
      print(a = names(test_pred)) # this returns NULL

      # Assigning the test_prob name to the output of attr
      print(test_prob<-paste("prob_",a, sep = "")) # this returns "prob_"

      print(assign(test_prob, attr(test_pred, "prob")))
      test_prob_names = c(test_prob_names, test_prob)
}

Expected result:

test_prob_names
> "prob_test_pred_3", "prob_test_pred_5", "prob_test_pred_7"

Actual result: I get the desired output (the probabilities), but not the names of the variables.

test_pred_names
> "test_pred_3", "test_pred_5", "test_pred_7"

test_prob_names
> "prob_", "prob_", "prob_"

If I simply run it outside the loop:

names(test_preds)
> "test_pred_3", "test_pred_5", "test_pred_7"
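A likely explanation is that `for (test_pred in test_preds)` hands the loop body each element of the list without the list's names, so names(test_pred) is asked of a single knn result, which carries no names of its own. Separately, print(a = ...) passes an argument called a to print() rather than assigning a variable; print(a <- ...) would be the assigning form. One way around the first issue is to loop over names(test_preds) and index the list with [[ ]]. The sketch below uses made-up stand-in objects (a hypothetical make_pred() helper that builds a factor with a "prob" attribute) so that it runs without the original data.

```r
# Hypothetical helper: mimics the shape of knn(..., prob = TRUE) output,
# i.e. a factor carrying a "prob" attribute (all values are made up)
make_pred <- function(labels, probs) {
  structure(factor(labels), prob = probs)
}

test_preds <- list(
  test_pred_3 = make_pred(c("+", "-"), c(0.67, 1.00)),
  test_pred_5 = make_pred(c("+", "-"), c(0.60, 0.80)),
  test_pred_7 = make_pred(c("-", "-"), c(0.57, 0.71))
)

# Loop over the names, not the elements, so the name is available in the body
test_probs <- list()
for (pred_name in names(test_preds)) {
  prob_name <- paste0("prob_", pred_name)
  test_probs[[prob_name]] <- attr(test_preds[[pred_name]], "prob")
}

test_prob_names <- names(test_probs)
test_prob_names
# "prob_test_pred_3" "prob_test_pred_5" "prob_test_pred_7"
```

Keeping the probabilities in a named list such as test_probs, instead of calling assign() for each one, is a design choice of this sketch; assign(prob_name, ...) inside the same loop would also work.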

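(Additional comment: I would actually prefer to trim the names and use only the number from "test_pred_5", so that the variable names look like "test_prob_5", but this is my first approach.)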
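For the shorter names mentioned in the comment above, one option is to pull the trailing number out of each name with a regular expression and paste it onto a new prefix. This is a minimal sketch that assumes every name ends in an underscore followed by digits; k_labels is a name introduced here for illustration.

```r
test_pred_names <- c("test_pred_3", "test_pred_5", "test_pred_7")

# Keep only the digits that follow the last underscore in each name
k_labels <- sub("^.*_([0-9]+)$", "\\1", test_pred_names)

# Rebuild the names with the new prefix
test_prob_names <- paste0("test_prob_", k_labels)
test_prob_names
# "test_prob_3" "test_prob_5" "test_prob_7"
```

If the numeric value of k is needed again, as.integer(k_labels) recovers it. When only the prefix has to change, sub("test_pred_", "test_prob_", test_pred_names) gives the same names in one step.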

Getting the names of the vectors within a vector inside a for loop

0 answers: