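I want to write a function that returns a 2-column data frame containing the hospital in each state that has the ranking specified in num.
The function rankall takes two arguments: an outcome name (outcome) and a hospital ranking (num). It reads the outcome-of-care-measures.csv file and returns a 2-column data frame containing the hospital in each state that has the ranking specified in num.
rankall <- function(outcome, num = "best") {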
## Read outcome data
## Check that state and outcome are valid
## For each state, find the hospital of the given rank
## Return a data frame with the hospital names and the
## (abbreviated) state name
}
head(rankall("heart attack", 20), 10)
                              hospital state
AK                                <NA>    AK
AL      D W MCMILLAN MEMORIAL HOSPITAL    AL
AR   ARKANSAS METHODIST MEDICAL CENTER    AR
AZ JOHN C LINCOLN DEER VALLEY HOSPITAL    AZ
CA               SHERMAN OAKS HOSPITAL    CA
CO            SKY RIDGE MEDICAL CENTER    CO
CT             MIDSTATE MEDICAL CENTER    CT
DC                                <NA>    DC
DE                                <NA>    DE
FL      SOUTH FLORIDA BAPTIST HOSPITAL    FL
My function works, but I do the last step (building the 2-column data frame) with the following loop:
new_data <- vector()
for(i in sort(unique(d$State))){
new_data <- rbind(new_data,cbind(d$Hospital.Name[which(d$State == i)][num],i))
}
new_data <- as.data.frame(new_data)
This gives the correct result, but I know the same loop can be written with the lapply function.
My attempt is wrong:
lapply(d,function(x) x <-rbind(x,d$Hospital.Name[which(d$State == i)][num]))
How can I fix it?
Answer 0 (score: 1)
I assume your data d is already sorted:
new_data <- do.call(rbind,
lapply(unique(d$State),
function(state){
data.frame(State = state,
Hospital.Name = d$Hospital.Name[which(d$State==state)][num],
stringsAsFactors = FALSE)
}))
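As a possible alternative to the do.call(rbind, lapply(...)) pattern above, here is a minimal sketch that uses split() with vapply(). It is not the code from the question or the answer. It assumes, like the answer, that d is already sorted by rank within each state. The toy d and the value of num below are made-up stand-ins for the real data.

```r
# Toy data standing in for the asker's d (hypothetical values),
# already sorted by state and by rank within each state.
d <- data.frame(
  Hospital.Name = c("A1", "A2", "A3", "B1", "C1", "C2"),
  State = c("AK", "AK", "AK", "AL", "AZ", "AZ"),
  stringsAsFactors = FALSE
)
num <- 2

# split() groups the hospital names by state; the groups come back
# ordered by state abbreviation.
by_state <- split(d$Hospital.Name, d$State)

# Pick the num-th hospital in each state. Indexing past the end of a
# group yields NA, which matches the <NA> rows in the expected output.
hospital <- vapply(by_state, function(h) h[num], character(1))

new_data <- data.frame(hospital = unname(hospital),
                       state = names(hospital),
                       row.names = names(hospital),
                       stringsAsFactors = FALSE)
print(new_data)
```

Compared with growing the result by rbind inside a loop, this builds the data frame once, and vapply() checks that each state contributes exactly one character value.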