Function does not work in R ("argument is not numeric or logical")

Time: 2017-10-10 21:53:45

Tags: r function

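I want to run a function in R from the following textbook (page 20, but I have posted it below): media.readthedocs.org/pdf/little-book-of-r-for-multivariate-analysis/latest/little-book-of-r-for-multivariate-analysis.pdf

The data set I am trying it on (the one used in this PDF) can be loaded like this: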

wine <- read.table("http://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data",
                   sep=",")

The function is first defined as follows and then called (last line):

calcBetweenGroupsVariance <- function(variable,groupvariable)
{
# find out how many values the group variable can take
groupvariable2 <- as.factor(groupvariable[[1]])
levels <- levels(groupvariable2)
numlevels <- length(levels)
# calculate the overall grand mean:
grandmean <- mean(variable)
# get the mean and standard deviation for each group:
numtotal <- 0
denomtotal <- 0
for (i in 1:numlevels)
{
leveli <- levels[i]
levelidata <- variable[groupvariable==leveli,]
levelilength <- length(levelidata)
# get the mean and standard deviation for group i:
meani <- mean(levelidata)
sdi <- sd(levelidata)
numi <- levelilength * ((meani - grandmean)^2)
denomi <- levelilength
numtotal <- numtotal + numi
denomtotal <- denomtotal + denomi
}
# calculate the between-groups variance
Vb <- numtotal / (numlevels - 1)
Vb <- Vb[[1]]
return(Vb)
}
calcBetweenGroupsVariance (wine[2],wine[1])

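It should return the between-groups variance of the variable "V2" (second column) across the three labels (first column). Unfortunately, R tells me that the "argument is not numeric or logical":

[screenshot of the error message]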

The structure of the data set looks like this:

[screenshot of the data set's structure]

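I don't know how to solve this. According to str(), the second column contains numeric data. I also tried the function on another data set and got the same problem. I searched for this error message and found many threads about it, but I could not draw any analogy to my problem.

If anyone could give me a hint about what to do, I would be very grateful! If you need more information, please let me know.

Many thanks in advance!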

2 answers:

Answer 0: (score: 0)

Try adding na.rm = TRUE to your grandmean <- mean(variable) call.
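
As a quick check of what na.rm = TRUE does and does not cover, here is a minimal sketch on made-up data (the vector x and the data frame df are illustrative assumptions, not part of the wine data). na.rm drops missing values from a numeric vector, but it does not appear to change what mean() does when it is handed a data.frame, which is what wine[2] is.

```r
# A numeric vector with a missing value: na.rm = TRUE helps here.
x <- c(1, NA, 3)
mean(x)                 # NA, because of the missing value
mean(x, na.rm = TRUE)   # 2

# A one-column data.frame (the same kind of object as wine[2]).
df <- data.frame(V2 = c(1, 2, 3))

# Still warns "argument is not numeric or logical" and returns NA.
suppressWarnings(mean(df, na.rm = TRUE))   # NA

# Extracting the column as a vector gives the expected mean.
mean(df[[1]])                              # 2
```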

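Answer 1: (score: 0)

It looks like the author of the book made some unusual decisions about how arguments are passed to the function. In this case it makes more sense (and is generally more useful) to pass in data vectors rather than requiring the user to pass in whole data.frames. So here is a change to the function itself and to how it is called, which should get it running.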

calcBetweenGroupsVariance <- function(variable, groupvariable) {
  # find out how many values the group variable can take
  groupvariable2 <- as.factor(groupvariable)
  levels <- levels(groupvariable2)
  numlevels <- length(levels)
  # calculate the overall grand mean:
  grandmean <- mean(variable)
  # get the mean and standard deviation for each group:
  numtotal <- 0
  denomtotal <- 0
  for (i in 1:numlevels)
  {
    leveli <- levels[i]
    levelidata <- variable[groupvariable==leveli]
    levelilength <- length(levelidata)
    # get the mean and standard deviation for group i:
    meani <- mean(levelidata)
    sdi <- sd(levelidata)
    numi <- levelilength * ((meani - grandmean)^2)
    denomi <- levelilength
    numtotal <- numtotal + numi
    denomtotal <- denomtotal + denomi
  }
  # calculate the between-groups variance
  Vb <- numtotal / (numlevels - 1)
  Vb <- Vb[[1]]
  return(Vb)
}

Then call it with:

calcBetweenGroupsVariance(wine[[2]], wine[[1]])
# or
calcBetweenGroupsVariance(wine$V2, wine$V1)
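
To see why the two call styles behave differently, it may help to compare single-bracket and double-bracket indexing on a small made-up data frame. The data frame toy and the function name betweenGroupsVariance below are illustrative assumptions, not taken from the book. The sketch also recomputes the same quantity with tapply() as a cross-check; that is a compact alternative, not the book's method.

```r
# Toy stand-in for the wine data: V1 is the group label, V2 a numeric variable.
toy <- data.frame(V1 = c(1, 1, 2, 2, 3, 3),
                  V2 = c(1, 3, 5, 7, 9, 11))

# Single brackets keep the data.frame wrapper;
# double brackets (or $) extract the underlying vector.
class(toy[2])     # "data.frame"
class(toy[[2]])   # "numeric"

# mean() on a data.frame warns "argument is not numeric or logical"
# and returns NA; on the extracted vector it works.
suppressWarnings(mean(toy[2]))   # NA
mean(toy[[2]])                   # 6

# Hypothetical compact equivalent of the corrected function, using tapply().
betweenGroupsVariance <- function(variable, groupvariable) {
  groups <- as.factor(groupvariable)
  n_i    <- tapply(variable, groups, length)  # number of observations per group
  mean_i <- tapply(variable, groups, mean)    # mean of each group
  sum(n_i * (mean_i - mean(variable))^2) / (nlevels(groups) - 1)
}

# Group means are 2, 6 and 10, the grand mean is 6:
# (2 * 16 + 2 * 0 + 2 * 16) / (3 - 1) = 32
betweenGroupsVariance(toy[[2]], toy[[1]])   # 32
```

On the real data, betweenGroupsVariance(wine$V2, wine$V1) should agree with the loop-based function above, since both sum n_i * (mean_i - grandmean)^2 over the groups and divide by the number of groups minus one.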