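I'm cleaning up some data imported from Excel. I'm trying to create a column of values based on the position of rows in a data frame. Specifically, I'm trying to use mutate() and ifelse() to assign a value to the rows that fall between two rows with specific character values. Here is a very simplified example of the data I'm working with: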
a b
[1,] "5" "yes"
[2,] "6" "no"
[3,] "7" "no"
[4,] "2" "yes"
[5,] "apple" NA
[6,] "4" "yes"
[7,] "1" "no"
[8,] "banana" NA
[9,] "6" "yes"
[10,] "3" "yes"
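I'd like to create a column c that holds a color as a character value: the rows between "apple" and "banana" (rows [6] and [7]) should get "red" in column c, and all other rows should get "blue". Is there a way to do this? Please let me know if I can explain my question more clearly!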
Answer 0 (score: 1)
Use the row_number function from the dplyr package.
#reproducing example
# b uses real NA values (not the string "NA") to match the data in the question
df <- data.frame(a = c("5","6","7","2","apple","4","1","banana","6","3"), b = c("yes","no","no","yes",NA,"yes","no",NA,"yes","yes"), stringsAsFactors = FALSE)
df$c <- "blue"
lim1 <- which(df$a == "apple")
lim2 <- which(df$a == "banana")
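Method 1:
# rows strictly between the two markers (assumes at least one row lies between them)
df$c[(lim1 + 1):(lim2 - 1)] <- "red"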
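Method 2:
library(dplyr)
# row_number() with no argument gives the row position; row_number(a) would rank the values of a
df <- df %>%
  mutate(c = ifelse(row_number() > lim1 & row_number() < lim2, "red", "blue"))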
Answer 1 (score: 1)
We can get the positions programmatically and then do the assignment:
i1 <- Reduce(`:`, which(is.na(df1$b))+ c(1, -1))
df1$c <- 'blue'
df1$c[i1] <- 'red'
Data:
df1 <- structure(list(a = c("5", "6", "7", "2", "apple", "4", "1", "banana",
"6", "3"), b = c("yes", "no", "no", "yes", NA, "yes", "no", NA,
"yes", "yes")), .Names = c("a", "b"), class = "data.frame", row.names = c(NA,
-10L))
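The one-line `Reduce` call is compact, so here is a sketch that breaks it into steps. It assumes, as in the example data, that column b has exactly two NA rows and at least one row between them; the intermediate names `na_rows` and `bounds` are only illustrative.

```r
# Self-contained copy of the example data (same values as df1 above)
df1 <- data.frame(
  a = c("5", "6", "7", "2", "apple", "4", "1", "banana", "6", "3"),
  b = c("yes", "no", "no", "yes", NA, "yes", "no", NA, "yes", "yes"),
  stringsAsFactors = FALSE
)

na_rows <- which(is.na(df1$b))  # positions of the two marker rows
bounds  <- na_rows + c(1, -1)   # move each end inward to exclude the markers
i1      <- Reduce(`:`, bounds)  # same as bounds[1]:bounds[2]

df1$c <- "blue"                 # default color for every row
df1$c[i1] <- "red"              # overwrite only the rows between the markers
```

With more than two NA rows, or with adjacent markers, the index range would no longer mean "the rows between", so this should be checked before using it on the real data.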
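Answer 2 (score: 1)
First, your data looks like a matrix rather than a data.frame, which you should fix if you plan to use dplyr. Once that is sorted out, you can use cumsum on each term (lagged if you don't want to count the apple row), subtract one from the other, and then use ifelse to convert 0 and 1 to blue and red.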
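The code for this answer did not survive, so what follows is only a sketch of the cumsum idea as described, not the answerer's original code. It is written in base R so it runs without extra packages; inside a dplyr pipeline, `dplyr::lag(a == "apple", default = FALSE)` could replace the manual shift. The names `after_apple`, `from_banana`, and `inside` are illustrative.

```r
df <- data.frame(
  a = c("5", "6", "7", "2", "apple", "4", "1", "banana", "6", "3"),
  b = c("yes", "no", "no", "yes", NA, "yes", "no", NA, "yes", "yes"),
  stringsAsFactors = FALSE
)

is_apple  <- df$a == "apple"
is_banana <- df$a == "banana"

# Shift the apple flag down one row so the apple row itself is not counted,
# then accumulate: the counter becomes 1 from the row after "apple" onward
after_apple <- cumsum(c(FALSE, head(is_apple, -1)))

# This counter becomes 1 from the "banana" row onward
from_banana <- cumsum(is_banana)

# The difference is 1 only for rows strictly between the two markers
inside <- after_apple - from_banana
df$c <- ifelse(inside == 1, "red", "blue")
```

This assumes "apple" occurs before "banana" and that each appears once; repeated or out-of-order markers would give differences other than 0 and 1.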
Answer 3 (score: 0)
The dplyr package provides the row_number() function, which can be combined with mutate and ifelse to assign values to specific row positions:
library(dplyr)
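# row_number() with no argument returns the row position; row_number(a) would rank the values of a instead
df <- df %>% mutate(c = ifelse(row_number() %in% c(6, 7), "red", "blue"))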
Answer 4 (score: 0)
Using mutate from dplyr:
df %>% mutate(c = ifelse(row_number() %>% between(match("apple",a)+0.1,match("banana",a)-0.1),"red","blue"))
With base R:
df <- transform(df,c = ifelse(1:nrow(df) > match("apple",a) & (1:nrow(df) < match("banana",a) ),"red","blue"))