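I am trying to loop over a list of hospital data split by state and extract the hospital that matches a specified rank (the num argument). I need to return a data frame with two columns, hospital and state, giving the hospital at the specified rank for a particular outcome in each state, so it should have 50 rows.
The problem is that I get back a data frame with only one row, containing the data for the last state (WY).
I know my code does exactly what I need up to the point where the character vectors hospitals and states are concatenated.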
rankall <- function(outcome, num = "best") {
  data <- read.csv("outcome-of-care-measures.csv", colClasses = "character")
  newframe <- as.data.frame(cbind(data[, 2], data[, 7], data[, 11], data[, 17], data[, 23]), stringsAsFactors = F)
  colnames(newframe) <- c("hospital", "state", "heart attack", "heart failure", "pneumonia")
  splitstates <- split(newframe, newframe$state)
  if (sum(outcome == "heart attack" | outcome == "pneumonia" | outcome == "heart failure") == 0) {
    stop("invalid outcome")
  }
  hospitals <- character()
  states <- character()
  for(i in length(splitstates)) {
    orderoutcome <- order(splitstates[[i]][, eval(outcome)], splitstates[[i]][, "hospital"], na.last = TRUE)
    if(num == "best") {
      num2 <- 1
      rank <- orderoutcome[num2]
    } else if(num == "worst") {
      num2 <- length(orderoutcome)
      rank <- orderoutcome[num2]
    } else {
      rank <- orderoutcome[num]
    }
    result <- splitstates[[i]][rank, "hospital"]
    hospitals <- c(hospitals, result)
    states <- c(states, splitstates[[i]][1, "state"])
  }
  return <- data.frame(hospitals, states)
  print(return)
}
Expected: a data frame with one row per state
Actual: a data frame with a single row corresponding to the last state (WY)
Answer 0 (score: 0)
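Consider refactoring your code to avoid redundancy in the data frame construction, vectors grown inside a loop, and the bookkeeping required for the iteration items.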
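On the bookkeeping point: the one-row result comes from for(i in length(splitstates)). length() returns a single number, so the loop body runs only once, for the last element (WY), whereas seq_along(splitstates) would visit every state. A minimal sketch, assuming a made-up three-element list standing in for the real split() result (the names visited_len and visited_all are hypothetical and exist only for this illustration):

```r
# A stand-in for the list returned by split(); the values are made up.
splitstates <- list(AK = 1:3, AL = 4:6, WY = 7:9)

# length() is a single number (3), so this loop body runs once, with i = 3.
visited_len <- character()
for (i in length(splitstates)) {
  visited_len <- c(visited_len, names(splitstates)[i])
}

# seq_along() yields 1, 2, 3, so every element is visited.
visited_all <- character()
for (i in seq_along(splitstates)) {
  visited_all <- c(visited_all, names(splitstates)[i])
}

print(visited_len)
print(visited_all)
```

The vectors are grown inside the loop here only to mirror the question's code; the point of the sketch is the difference between the two loop ranges.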
Use an apply-family method, by(), instead of the for loop. by() is the object-oriented wrapper around tapply() and is similar to lapply() + split() (or, in your case, for + split()).
rankall <- function(outcome, num = "best") {
  # VALIDATE THE REQUESTED OUTCOME BEFORE READING ANY DATA
  if (!(outcome %in% c("heart attack", "pneumonia", "heart failure"))) {
    stop("invalid outcome")
  }
  data <- read.csv("outcome-of-care-measures.csv", colClasses = "character")
  newframe <- setNames(data[, c(2, 7, 11, 17, 23)],
                       c("hospital", "state", "heart attack", "heart failure", "pneumonia"))
  # ORDER ENTIRE DATA FRAME BY STATE, OUTCOME, AND HOSPITAL
  # (COLUMNS ARE READ AS CHARACTER, SO THE OUTCOME SORTS AS TEXT, AS IN THE QUESTION'S CODE)
  newframe <- newframe[order(newframe$state, newframe[[outcome]], newframe$hospital), ]
  row.names(newframe) <- NULL
  # BUILD LIST OF 50 DFs FOR EACH STATE SUBSET
  df_list <- by(newframe, newframe$state, function(sub) {
    # CONDITIONALLY ASSIGN ROW SLICE
    if (num == "best") {
      df <- head(sub, 1)
    } else if (num == "worst") {
      df <- tail(sub, 1)
    } else {
      df <- sub[num, ]
    }
    return(df[c("hospital", "state")])
  })
  # STACK THE PER-STATE ONE-ROW DATA FRAMES INTO ONE DATA FRAME
  final_df <- do.call(rbind, unname(df_list))
  row.names(final_df) <- NULL
  return(final_df)
}
This builds the final data frame of ranked hospitals for all states.
Rextester demo (using randomly seeded data for 5 states)
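For readers without the CSV at hand, the by() + do.call(rbind, ...) pattern can be sketched on a toy data frame. Everything in it is an assumption made for illustration: the hospital names, the numeric rate column, and the variable names toy, best_list and best do not come from the real data set.

```r
# Toy data: hypothetical hospitals and rates, not taken from the real CSV.
toy <- data.frame(
  hospital = c("A", "B", "C", "D", "E"),
  state    = c("AL", "AL", "AK", "AK", "AK"),
  rate     = c(12.1, 9.8, 15.0, 11.2, 13.4),
  stringsAsFactors = FALSE
)

# Sort once by state, then rate, then hospital name as the tie-breaker.
toy <- toy[order(toy$state, toy$rate, toy$hospital), ]

# by() passes each state's rows to the function; here we keep the top row.
best_list <- by(toy, toy$state, function(sub) {
  head(sub[c("hospital", "state")], 1)
})

# Stack the per-state one-row data frames into a single data frame.
best <- do.call(rbind, unname(best_list))
row.names(best) <- NULL
print(best)
```

Because the frame is sorted before it is split, "best" and "worst" reduce to head() and tail() within each group, which is the same design choice the function above relies on.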