lapply instead of a loop in R

Date: 2016-10-05 11:19:52

Tags: r loops lapply

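I want to write a function that returns a 2-column data frame containing, for each state, the hospital that has the ranking specified in num.

rankall takes two arguments: an outcome name (outcome) and a hospital ranking (num). The function reads the outcome-of-care-measures.csv file and returns a 2-column data frame containing, for each state, the hospital that has the ranking specified in num.
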
rankall <- function(outcome, num = "best") {
## Read outcome data
## Check that state and outcome are valid
## For each state, find the hospital of the given rank
## Return a data frame with the hospital names and the
## (abbreviated) state name
}

head(rankall("heart attack", 20), 10)
                              hospital state
AK                                <NA>    AK
AL      D W MCMILLAN MEMORIAL HOSPITAL    AL
AR   ARKANSAS METHODIST MEDICAL CENTER    AR
AZ JOHN C LINCOLN DEER VALLEY HOSPITAL    AZ
CA               SHERMAN OAKS HOSPITAL    CA
CO            SKY RIDGE MEDICAL CENTER    CO
CT             MIDSTATE MEDICAL CENTER    CT
DC                                <NA>    DC
DE                                <NA>    DE
FL      SOUTH FLORIDA BAPTIST HOSPITAL    FL

My function works correctly, but I do the last step (building the 2-column data frame) with the following loop:

new_data <- vector()
    for(i in sort(unique(d$State))){
        new_data <- rbind(new_data,cbind(d$Hospital.Name[which(d$State == i)][num],i))
    }
new_data <- as.data.frame(new_data)

This gives the correct result, but I know the same loop can be written with the lapply function.

My attempt is wrong:

lapply(d,function(x) x <-rbind(x,d$Hospital.Name[which(d$State == i)][num]))

How can I fix this?

1 Answer:

Answer 0: (score: 1)

I assume your data frame d is already sorted:

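# Build one single-row data frame per state, then stack them with rbind
new_data <- do.call(rbind,
                    lapply(unique(d$State),
                           function(state) {
                             data.frame(State = state,
                                        # num-th hospital in this state (NA if the state has fewer than num)
                                        Hospital.Name = d$Hospital.Name[which(d$State == state)][num],
                                        stringsAsFactors = FALSE)
                           }))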
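
For reference, here is a minimal self-contained sketch of the same pattern on toy data. The small d, the hospital names in it and num = 3 are made up for illustration; the real d would come from outcome-of-care-measures.csv and is assumed to be sorted by state and then by rank within each state. The column names hospital and state, and the row names, are chosen only to mimic the expected output in the question.

```r
# Toy stand-in for d (hypothetical values, already sorted by State and rank)
d <- data.frame(
  State = c("AK", "AK", "AL", "AL", "AL"),
  Hospital.Name = c("A1", "A2", "B1", "B2", "B3"),
  stringsAsFactors = FALSE
)
num <- 3

# Iterate over the state codes (not over d itself) and
# return a one-row data frame for each state
rows <- lapply(sort(unique(d$State)), function(state) {
  hospitals <- d$Hospital.Name[d$State == state]
  # Indexing past the end of the vector yields NA, which is
  # what we want for states with fewer than num hospitals
  data.frame(hospital = hospitals[num], state = state,
             stringsAsFactors = FALSE)
})

# Stack the one-row data frames into a single data frame
new_data <- do.call(rbind, rows)
rownames(new_data) <- new_data$state
print(new_data)
```

The attempt in the question probably fails for two reasons: lapply(d, ...) walks over the columns of d, because a data frame is a list of columns, and i is never defined inside the anonymous function. Passing the vector of state codes to lapply, as above, gives the function the value that i had in the for loop.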