I am trying to extract data from a data frame for analysis.
heightweight <- function(person, health) {
  ## Read in data
  data <- read.csv("heightweight.csv", header = TRUE,
                   colClasses = "character")
  ## Check that the outcomes are valid
  measure = c("height", "weight")
  if(health %in% measure == FALSE){
    stop("Valid inputs are height and weight")
  }
  ## Truncate the data matrix to only what columns are needed
  data <- data[c(1, 5, 7)]
  ## Rename columns
  names(data)[1] <- "Name"
  names(data)[2] <- "Height"
  names(data)[3] <- "Weight"
  ## Convert numeric columns to numeric
  data[, 2] <- as.numeric(data[, 3])
  data[, 3] <- as.numeric(data[, 4])
  ## Convert NAs to 0 after coercion
  data[is.na(data)] <- 0
  ## Check that the name is valid
  name <- data[, 1]
  name <- unique(name)
  if(person %in% name == FALSE){
    stop("Invalid person")
  }
  ## Return person with lowest height or weight
  list <- data[data$name == person & data[health],]
  outcomes <- list[, health]
  minumum <- which.min(outcomes)
  ## Min Rate
  minimum[rowNum, ]$name
}
The problem I am running into is with this line:
list <- data[data$name == person & data[health],]
That is, when I run heightweight("Bob", "weight"), I get the following message:
Error in matrix(unlist(value, recursive = FALSE, use.names = FALSE), nrow = nr, :
length of 'dimnames' [2] not equal to array extent
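I have Googled this message and checked several posts here, but I cannot work out what the problem is.
Answer 0 (score: 3)
Unless I am missing something, the last three lines of your code are more than you need if all you want is the lowest weight or height for a given name.
Here is a simple way to get the minimum of a health measure for a particular person: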
min(data[data$name==person, "height"])
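The first part, before the comma, selects only the rows of data that belong to that person; it acts as the row index. The second part, after the comma, selects only the required variable (column). Once the required data are selected, min() finds the minimum within that subset.
An example to illustrate the result: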
data<-data.frame(name=as.character(c(rep("carlos",2),rep("marta",3),rep("johny",2),"sara")))
set.seed(1)
data$height <- rnorm(8,68,3)
data$weight <- rnorm(8,160,10)
The resulting data frame:
name height weight
1 carlos 66.12064 165.7578
2 carlos 68.55093 156.9461
3 marta 65.49311 175.1178
4 marta 72.78584 163.8984
5 marta 68.98852 153.7876
6 johny 65.53859 137.8530
7 johny 69.46229 171.2493
8 sara 70.21497 159.5507
Suppose we want the minimum weight for marta:
person <- "marta"
health <- "weight"
The minimum "weight" for "marta" is obtained with
min(data[data$name==person,health])
which gives the desired result:
[1] 153.7876
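The one-line lookup can also be wrapped in a function that keeps the input checks from the question. The following is only a minimal sketch: the function name min_measure, the data frame demo, and its values are made up for illustration and are not part of the original post, and it assumes the columns are named name, height and weight in lower case.

```r
# Hypothetical helper (not from the original post): return the minimum of one
# health measure for one person, validating both arguments first.
min_measure <- function(data, person, health) {
  measures <- c("height", "weight")
  if (!(health %in% measures)) {
    stop("Valid inputs are height and weight")
  }
  if (!(person %in% data$name)) {
    stop("Invalid person")
  }
  # The logical row index picks the person's rows; `health` names the column.
  min(data[data$name == person, health])
}

# Small made-up data set, for illustration only.
demo <- data.frame(
  name   = c("ann", "ann", "bob"),
  height = c(64, 66, 70),
  weight = c(130, 128, 180),
  stringsAsFactors = FALSE
)

min_measure(demo, "ann", "weight")
# [1] 128
```

Passing the data frame in as an argument, rather than reading the CSV inside the function, makes the lookup easy to try on a small sample before running it on the real file.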
Answer 1 (score: 0)
Here is a simplified mock-up of your function:
heightweight <- function(person,health) {
  ## Sample data set
  data.set <- data.frame(names=rep(letters[1:5],each=3),height=171:185,weight=seq(95,81,by=-1))
  ## Use the full column name rather than relying on partial matching of $name
  d1 <- data.set[data.set$names == person,]
  d2 <- d1[d1[,health]==min(d1[,health]),]
  d2[,c('names',health)]
}
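The first line generates a sample data set. The second line selects all records for the given person. The third line finds the records whose health value equals the minimum for that person, and the last line returns the names column together with the health column.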
heightweight('b','height')
# names height
# 4 b 174
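Since the question uses which.min(), it may help to see how it differs from comparing against min() when several rows share the minimum. This is a sketch on made-up data (the data frame d and its values are assumptions chosen to create a tie), not part of the original answer.

```r
# Made-up data with a tie on the minimum height for person "b".
d <- data.frame(
  names  = c("a", "b", "b", "b"),
  height = c(170, 174, 174, 176),
  stringsAsFactors = FALSE
)
d1 <- d[d$names == "b", ]

# Comparing against min() keeps every row that ties for the minimum.
ties <- d1[d1$height == min(d1$height), ]

# which.min() returns the position of the first minimum only.
first <- d1[which.min(d1$height), ]

nrow(ties)   # 2
nrow(first)  # 1
```

Which form is preferable depends on whether tied records should all be reported or only one of them.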