Question

I'm trying to extract data from a data frame for analysis.

heightweight <- function(person, health) {
    ## Read in data
    data <- read.csv("heightweight.csv", header = TRUE,
                     colClasses = "character")
    ## Check that the outcomes are valid
    measure = c("height", "weight")
    if(health %in% measure == FALSE){
        stop("Valid inputs are height and weight")
    }
    ## Truncate the data matrix to only what columns are needed
    data <- data[c(1, 5, 7)]
    ## Rename columns
    names(data)[1] <- "Name"
    names(data)[2] <- "Height"
    names(data)[3] <- "Weight"
    ## Convert numeric columns to numeric
    data[, 2] <- as.numeric(data[, 3])
    data[, 3] <- as.numeric(data[, 4])
    ## Convert NAs to 0 after coercion
    data[is.na(data)] <- 0
    ## Check that the name is valid
    name <- data[, 1]
    name <- unique(name)
    if(person %in% name == FALSE){
        stop("Invalid person")
    }
    ## Return person with lowest height or weight
    list <- data[data$name == person & data[health],]
    outcomes <- list[, health]
    minumum <- which.min(outcomes)
    ## Min Rate
    minimum[rowNum, ]$name
}

The problem I'm running into is with this line:

list <- data[data$name == person & data[health],]

That is, when I run heightweight("Bob", "weight"), I get the following message:

Error in matrix(unlist(value, recursive = FALSE, use.names = FALSE), nrow = nr,  : 
  length of 'dimnames' [2] not equal to array extent

I've Googled this message and checked some posts here, but I can't pin down the problem.

Answer 1

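Unless I'm missing something, if you only need the lowest weight or height for a given name, the last three lines of code are somewhat redundant.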

Here is a simple way to get the minimum health measure for a specific person:

min(data[data$name==person, "height"])

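The first part selects only the rows of data that correspond to that person; it acts as the row index. The second part, after the comma, selects only the desired variable (column). Once the desired data is selected, you look for the minimum within that subset.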
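To make the two indexing steps concrete, here is a minimal sketch that performs them one at a time. The data frame `toy` and the names in it are hypothetical, made up only for this illustration.

```r
# Hypothetical toy data, used only to show the two indexing steps
toy <- data.frame(name   = c("ann", "ann", "bob"),
                  height = c(160, 158, 175),
                  stringsAsFactors = FALSE)

rows <- toy$name == "ann"       # logical row index, one TRUE/FALSE per row
vals <- toy[rows, "height"]     # the height column, restricted to those rows
min(vals)                       # smallest value in that subset
```

Because a single column is selected, `toy[rows, "height"]` comes back as a plain numeric vector, which is what `min()` expects.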

An example to illustrate the result:

data<-data.frame(name=as.character(c(rep("carlos",2),rep("marta",3),rep("johny",2),"sara")))
set.seed(1)
data$height <- rnorm(8,68,3)
data$weight <- rnorm(8,160,10)

The corresponding data frame:

   name   height   weight
1 carlos 66.12064 165.7578
2 carlos 68.55093 156.9461
3  marta 65.49311 175.1178
4  marta 72.78584 163.8984
5  marta 68.98852 153.7876
6  johny 65.53859 137.8530
7  johny 69.46229 171.2493
8   sara 70.21497 159.5507

Let's say we want marta's minimum weight:

person <- "marta"
health <- "weight"

The minimum "weight" for "marta" is,

min(data[data$name==person,health])

which gives the desired result:

[1] 153.7876
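Putting this together, the original function could perhaps be reduced to something like the sketch below. The function name `lowest_measure` and its `data` argument are assumptions made for illustration; the validation messages are taken from the question, and the example data is rebuilt inside the block so it runs on its own.

```r
# Rebuild the example data from this answer
data <- data.frame(name = as.character(c(rep("carlos", 2), rep("marta", 3),
                                         rep("johny", 2), "sara")))
set.seed(1)
data$height <- rnorm(8, 68, 3)
data$weight <- rnorm(8, 160, 10)

# Sketch only: `lowest_measure` and its `data` argument are hypothetical names
lowest_measure <- function(data, person, health) {
    # Reject measures other than the two numeric columns
    if (!health %in% c("height", "weight")) {
        stop("Valid inputs are height and weight")
    }
    # Reject names that do not appear in the data
    if (!person %in% data$name) {
        stop("Invalid person")
    }
    # Row index picks the person, column index picks the measure
    min(data[data$name == person, health])
}

lowest_measure(data, "marta", "weight")
```

Passing the data frame in as an argument, rather than reading the CSV inside the function, is a design choice of this sketch; it makes the function easy to try on small examples like the one above.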

Answer 2

Here is a simplified simulation of your function:

heightweight <- function(person,health) {
  data.set <- data.frame(names=rep(letters[1:5],each=3),height=171:185,weight=seq(95,81,by=-1))
  d1 <- data.set[data.set$names == person,]
  d2 <- d1[d1[,health]==min(d1[,health]),]
  d2[,c('names',health)]    
}

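The first line generates a sample data set. The second line selects all records for the given person. The third line keeps the record(s) whose health value equals the minimum, and the last line returns only the names and health columns.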

heightweight('b','height')
#   names height
# 4     b    174

R: Extracting data from a data frame for analysis
