Looping with an if/else statement that refers to the result of the previous iteration

Time: 2019-12-29 07:05:41

Tags: r for-loop

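I have a data frame in which the field x contains group names (letters in the example below) and group members (numbers, listed under their group name). I want to create a field that shows, for each member, the name of its group. In the data frame below, the desired output is shown in the outcome column.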

library(dplyr)  # provides %>% and mutate()

df <- data.frame("x"=c("A","1","2","B","C","1","2","C","D","1"),
                 "outcome"=c("A","A","A","B","C","C","C","C","D","D")
) %>%
  mutate(
    Letter = ifelse(grepl("[A-Za-z]", x), "Letter",
                    "No Letter")
  )

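My idea was to do this with a for loop: if x is a letter, the loop should return that letter; otherwise it should return the result of the previous iteration (i.e. the previous letter found in x). The for loop below does not give the correct output: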

df$outcome_calc[1] <- "A" 
for (i in 2:10) {  
  df$outcome_calc[i] <- ifelse(df$Letter[i] == "No Letter",df$outcome_calc[i-1],df$x[i])    

}

Any ideas on how to get the correct output?

5 answers:

Answer 0 (score: 2)

Here are two very similar tidyverse approaches, both using the convenience function zoo::na.locf.

First:

library(tidyverse)

df %>%
  mutate(na = is.na(as.numeric(as.character(x))),
         outcome2 = ifelse(na, as.character(x), NA_character_),
         outcome2 = zoo::na.locf(outcome2)) %>%
  select(-na)

Another:

df %>%
  mutate(chr = !grepl("[[:digit:]]", x),
         outcome2 = ifelse(chr, as.character(x), NA_character_),
         outcome2 = zoo::na.locf(outcome2)) %>%
  select(-chr)

Answer 1 (score: 1)

Here is one way to do this with a for loop:

# keeps track of previous letter
prev = ''

# output
op = c()

for (i in df$x){

    # check the pattern
    check = grepl(pattern = '[a-zA-Z]', x = i, ignore.case = T)

    if(isTRUE(check)){
        op = c(op, i)
        prev = i
    } else {
        op = c(op, prev)
    }

}

print(op)
[1] "A" "A" "A" "B" "C" "C" "C" "C" "D" "D"
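
A possible explanation for why the loop in the question misbehaves, assuming an R version before 4.0.0: data.frame() converted strings to factors by default there, and ifelse() drops factor attributes, so the letter branch would return integer codes instead of letters. The sketch below is one way to sidestep that, not the original poster's code: it keeps x as character, uses a plain if instead of ifelse(), and preallocates the result rather than growing it with c(). The names is_letter and outcome_calc are illustrative.

```r
# Example data from the question; stringsAsFactors = FALSE keeps x as character
# (this is already the default from R 4.0.0 onwards).
df <- data.frame(x = c("A", "1", "2", "B", "C", "1", "2", "C", "D", "1"),
                 stringsAsFactors = FALSE)

# TRUE where x holds a group name (i.e. contains a letter)
is_letter <- grepl("[A-Za-z]", df$x)

# Preallocate the result instead of growing it inside the loop
outcome_calc <- character(nrow(df))
prev <- NA_character_  # no group seen yet

for (i in seq_len(nrow(df))) {
  if (is_letter[i]) {
    prev <- df$x[i]    # a new group starts on this row
  }
  outcome_calc[i] <- prev
}

df$outcome_calc <- outcome_calc
print(df$outcome_calc)
```

If the first rows contained members before any group name, they would stay NA here rather than raise an error.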

Answer 2 (score: 1)

Alternatively, you can use the sapply function to avoid a for loop.

First, find the positions of the letters:

pos_letter <- grep("[A-Za-z]", df$x)

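Then use sapply to 1) find, for each row, the position of the nearest letter at or above it, and 2) replace each position with the corresponding letter: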

df$out <- sapply(seq_len(nrow(df)), function(i) max(pos_letter[pos_letter <= i]))
df$out2 <- sapply(df$out, function(i) as.character(df[i, "x"]))

   x outcome out out2
1  A       A   1    A
2  1       A   1    A
3  2       A   1    A
4  B       B   4    B
5  C       C   5    C
6  1       C   5    C
7  2       C   5    C
8  C       C   8    C
9  D       D   9    D
10 1       D   9    D

You can combine the two sapply calls into a single line:

sapply(1:nrow(df), function(n) as.character(df[max(pos_letter[pos_letter <= n]),"x"]))

[1] "A" "A" "A" "B" "C" "C" "C" "C" "D" "D"
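
The same "nearest letter at or above" lookup can also be sketched without sapply, using cumsum() on the letter indicator. This is a swapped-in base R technique rather than part of the answer above; the names is_letter, group_id and out are illustrative, and x is assumed to be a character vector.

```r
x <- c("A", "1", "2", "B", "C", "1", "2", "C", "D", "1")

# TRUE where the element is a group name (contains a letter)
is_letter <- grepl("[A-Za-z]", x)

# Running count of group names seen so far: 1 1 1 2 3 3 3 4 5 5
group_id <- cumsum(is_letter)

# Rows before the first group name would get id 0; map them to NA
group_id[group_id == 0] <- NA

# Pick the group name for each row by its running group index
out <- x[is_letter][group_id]
print(out)
```

Because every step is vectorised, this avoids calling a function once per row, which may matter for long columns.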

Answer 3 (score: 1)

Using tidyr::fill, which requires NAs where your numbers are:

df = data.frame(x = c("A","1","2","B","C","1","2","C","D","1"),
                stringsAsFactors = FALSE)

df$x[grepl("[0-9]+", df$x)] = NA

tidyr::fill(df, x)
   x
1  A
2  A
3  A
4  B
5  C
6  C
7  C
8  C
9  D
10 D

Answer 4 (score: 0)

dplyr

Here is a streamlined version of the zoo::na.locf approach that does not require creating a temporary helper column. It uses stringr::str_detect(), if_else(), and zoo::na.locf():

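library(dplyr)
# as.character() keeps both if_else() branches the same type,
# whether x is a factor or a character column
df %>% 
  mutate(outcome2 = if_else(stringr::str_detect(as.character(x), "\\D"), as.character(x), NA_character_) %>% zoo::na.locf())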
   x outcome    Letter outcome2
1  A       A    Letter        A
2  1       A No Letter        A
3  2       A No Letter        A
4  B       B    Letter        B
5  C       C    Letter        C
6  1       C No Letter        C
7  2       C No Letter        C
8  C       C    Letter        C
9  D       D    Letter        D
10 1       D No Letter        D

data.table

For the sake of completeness, here is also a data.table approach, which I use frequently. It uses assignment by reference to update df:

library(data.table)
setDT(df)[x %like% "\\D", outcome2 := x][, outcome2 := zoo::na.locf(outcome2)][]
    x outcome    Letter outcome2
 1: A       A    Letter        A
 2: 1       A No Letter        A
 3: 2       A No Letter        A
 4: B       B    Letter        B
 5: C       C    Letter        C
 6: 1       C No Letter        C
 7: 2       C No Letter        C
 8: C       C    Letter        C
 9: D       D    Letter        D
10: 1       D No Letter        D