Problem concatenating character vectors in a for loop: only the final object is returned

Time: 2019-05-31 18:28:15

Tags: r for-loop concatenation

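I am trying to loop over a list of hospital data, split by state, and extract the hospital that matches a specified rank (the num argument). I need to return a data frame with two columns, hospital and state, giving for each state the hospital at the specified rank for a particular outcome, so it should have 50 rows.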

The problem is that I get back a data frame with only one row, containing the data for the last state (WY).

I know that my code does exactly what I need up to the point where the character vectors hospitals and states are concatenated.

rankall <- function(outcome, num = "best") {
    data <- read.csv("outcome-of-care-measures.csv", colClasses = "character")
    newframe <- as.data.frame(cbind(data[, 2], data[, 7], data[, 11], data[, 17], data[, 23]), stringsAsFactors = F)
    colnames(newframe) <- c("hospital", "state", "heart attack", "heart failure", "pneumonia")
    splitstates <- split(newframe, newframe$state)

    if (sum(outcome == "heart attack" | outcome == "pneumonia" | outcome == "heart failure") == 0) {
        stop("invalid outcome")
    }
    hospitals <- character()
    states <- character()

    for(i in length(splitstates)) {
        orderoutcome <- order(splitstates[[i]][, eval(outcome)], splitstates[[i]][, "hospital"], na.last = TRUE)
        if(num == "best") {
            num2 <- 1
            rank <-orderoutcome[num2]
        } else if(num == "worst") {
            num2 <- length(orderoutcome)
            rank <- orderoutcome[num2]
        } else {
            rank <- orderoutcome[num] 
        }
        result <- splitstates[[i]][rank, "hospital"]
        hospitals <- c(hospitals, result)
        states <- c(states, splitstates[[i]][1, "state"])
    }
    return <- data.frame(hospitals, states)
    print(return)
}

Expected: a data frame with one row for each state

Actual: a data frame with a single row corresponding to the last state (WY)

1 Answer:

Answer 0: (score: 0)

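Consider refactoring your code to avoid redundancy in the data frame construction, vectors that grow inside a loop, and the bookkeeping needed for iteration items.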
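One piece of that bookkeeping appears to explain the single-row result in the question: for(i in length(splitstates)) iterates over the single value length(splitstates), not over 1 through that length, so only the last element is processed. The sketch below illustrates this on a made-up three-element list (the values are placeholders, not the hospital data) and shows seq_along() and vapply() as alternatives.

```r
# Toy stand-in for the list produced by split(); names and values are made up.
splitstates <- list(AK = 1, AL = 2, WY = 3)

# length() returns the single number 3, so the loop body runs once, with i = 3.
visited <- character()
for (i in length(splitstates)) {
    visited <- c(visited, names(splitstates)[i])
}
print(visited)

# seq_along() yields 1, 2, 3, so every element is visited.
visited_all <- character()
for (i in seq_along(splitstates)) {
    visited_all <- c(visited_all, names(splitstates)[i])
}
print(visited_all)

# vapply() needs no index bookkeeping and no growing vector:
# it returns one numeric value per list element.
scaled <- vapply(splitstates, function(x) x * 10, numeric(1))
print(scaled)
```

Replacing length(splitstates) with seq_along(splitstates) in the original loop header would be the smallest change; the refactoring suggested here removes the loop altogether.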

Instead of the for loop, consider the apply family method by, the object-oriented wrapper to tapply. It is similar to lapply + split (or, in your case, for + split) and can build the final data frame of ranked hospitals across all states.

rankall <- function(outcome, num = "best") {
    if (!(outcome %in% c("heart attack", "pneumonia", "heart failure"))) {
        stop("invalid outcome")
    }

    data <- read.csv("outcome-of-care-measures.csv", colClasses = "character")
    newframe <- setNames(data[, c(2, 7, 11, 17, 23)],
                         c("hospital", "state", "heart attack", "heart failure", "pneumonia"))

    # ORDER ENTIRE DATA FRAME BY STATE, OUTCOME, AND HOSPITAL
    newframe <- with(newframe, newframe[order(state, newframe[[outcome]], hospital), ])
    row.names(newframe) <- NULL

    # BUILD LIST OF 50 DFs FOR EACH STATE SUBSET
    df_list <- by(newframe, newframe$state, function(sub) {
        # CONDITIONALLY ASSIGN ROW SLICE
        if (num == "best") {
            df <- head(sub, 1)
        } else if (num == "worst") {
            df <- tail(sub, 1)
        } else {
            df <- sub[num, ]
        }
        return(df[c("hospital", "state")])
    })

    final_df <- do.call(rbind, unname(df_list))
    row.names(final_df) <- NULL
    return(final_df)
}

Rextester demo (randomly seeded data for 5 states)
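Without the course CSV at hand, the same by + do.call(rbind, ...) pattern can be tried on a small made-up data frame. This is only a sketch under stated assumptions: the toy data, the rate column, and the helper name pick_rank are all hypothetical, and the rates are stored as numbers here so that order() sorts them numerically.

```r
# Made-up data: two states, three hospitals each; rates are numeric here.
toy <- data.frame(
    hospital = c("H1", "H2", "H3", "H4", "H5", "H6"),
    state    = c("AL", "AL", "AL", "WY", "WY", "WY"),
    rate     = c(14.2, 9.8, 11.0, 10.5, 13.1, 10.5),
    stringsAsFactors = FALSE
)

# Sort once by state, then rate, then hospital name (to break ties).
toy <- toy[order(toy$state, toy$rate, toy$hospital), ]
row.names(toy) <- NULL

# Hypothetical helper: pick the row at rank `num` within each state.
pick_rank <- function(df, num = "best") {
    pieces <- by(df, df$state, function(sub) {
        if (num == "best") {
            row <- head(sub, 1)
        } else if (num == "worst") {
            row <- tail(sub, 1)
        } else {
            row <- sub[num, ]
        }
        row[c("hospital", "state")]
    })
    # Stack the per-state one-row data frames into a single data frame.
    out <- do.call(rbind, unname(pieces))
    row.names(out) <- NULL
    out
}

best   <- pick_rank(toy, "best")
worst  <- pick_rank(toy, "worst")
second <- pick_rank(toy, 2)
print(best)
```

Each call returns one row per state, which is the shape the question expects from rankall.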