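I am trying to create a new column in my data frame (called Error_1_Count) that keeps, for each distinct value of "Name", a running count of how many times "Error Type 1" has appeared in the column called "Error". An example of the result data frame I want is shown below.
I tried writing a loop with an assignment based on the error (see below), but the counts in my output are wrong (I only get 0s and 1s).
Please let me know how I can improve the code and make sure the count resets only when a new value of "Name" begins. Thanks!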
Desired result:
Name  Error         Error_1_Count
A     Error Type 1  1
A     Error Type 4  1
A     Error Type 1  2
B     Error Type 2  0
A     Error Type 1  3
C     Error Type 3  0
D     Error Type 1  1
names <- unique(data.df$name)
count <- 0
for (i in names) {
data.df[data.df$name == i, data.df$error_1_count <- ifelse(data.df$error == 'Error Type 1', count + 1, count)]
}
#View(data.df)
#print(unique(data.df$error_1_count))
Answer 0 (score: 3)
You can use ave and cumsum.
x$Error_1_Count <- ave(x$Error == "Error Type 1", x$Name, FUN=cumsum)
x
# Name Error Error_1_Count
#1 A Error Type 1 1
#2 A Error Type 4 1
#3 A Error Type 1 2
#4 B Error Type 2 0
#5 A Error Type 1 3
#6 C Error Type 3 0
#7 D Error Type 1 1
Data:
x <- structure(list(Name = structure(c(1L, 1L, 1L, 2L, 1L, 3L, 4L), .Label = c("A",
"B", "C", "D"), class = "factor"), Error = structure(c(1L, 4L,
1L, 2L, 1L, 3L, 1L), .Label = c("Error Type 1", "Error Type 2",
"Error Type 3", "Error Type 4"), class = "factor")), row.names = c(NA,
-7L), class = "data.frame")
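To make the connection to the loop in the question explicit: there, count is never updated, so ifelse() can only ever return count + 1 (1) or count (0), and the result is not restricted to one name at a time. The following is a minimal sketch of a repaired loop, not the answerer's method; the variable names nm and rows are chosen here for illustration, and the sample data is rebuilt with plain character columns. It does by hand what ave does: take the rows for one name, run cumsum over them, and write the result back in the original row order.

```r
# Same sample data as above, rebuilt with character columns (an assumption
# made here so the example is self-contained).
x <- data.frame(
  Name  = c("A", "A", "A", "B", "A", "C", "D"),
  Error = c("Error Type 1", "Error Type 4", "Error Type 1", "Error Type 2",
            "Error Type 1", "Error Type 3", "Error Type 1"),
  stringsAsFactors = FALSE
)

# One pass per distinct Name, so the running count restarts for each name.
x$Error_1_Count <- 0L
for (nm in unique(x$Name)) {
  rows <- x$Name == nm  # logical index of the rows belonging to this name
  # cumsum() over a logical vector counts the TRUEs seen so far.
  x$Error_1_Count[rows] <- cumsum(x$Error[rows] == "Error Type 1")
}

# The loop should agree with the one-line ave() call.
via_ave <- ave(x$Error == "Error Type 1", x$Name, FUN = cumsum)
stopifnot(all(x$Error_1_Count == via_ave))

x$Error_1_Count
```

The ave call is shorter and avoids the explicit loop; the loop version is mainly useful for seeing where the per-name reset happens.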
Answer 1 (score: 3)
With dplyr:
library(dplyr)
df1 %>%
  group_by(Name) %>%
  # write the running count to a new column instead of overwriting Error
  mutate(Error_1_Count = cumsum(Error == "Error Type 1"))
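Note that a dplyr pipeline returns a new, still-grouped tibble and leaves df1 itself unchanged. As a hedged sketch (assuming dplyr is installed; the name result is chosen here for illustration), the output can be stored and the grouping dropped with ungroup() so that later steps are not silently applied per name:

```r
library(dplyr)

# Sample data matching the question's table, with character columns.
df1 <- data.frame(
  Name  = c("A", "A", "A", "B", "A", "C", "D"),
  Error = c("Error Type 1", "Error Type 4", "Error Type 1", "Error Type 2",
            "Error Type 1", "Error Type 3", "Error Type 1"),
  stringsAsFactors = FALSE
)

result <- df1 %>%
  group_by(Name) %>%                                         # count within each name
  mutate(Error_1_Count = cumsum(Error == "Error Type 1")) %>%
  ungroup()                                                  # drop grouping afterwards

result$Error_1_Count
```

Because mutate() keeps rows in their original order, the counts line up with the input rows without any re-sorting.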
Data:
df1 <- structure(list(Name = c("A", "A", "A", "B", "A", "C", "D"), Error = c("Error Type 1",
"Error Type 4", "Error Type 1", "Error Type 2", "Error Type 1",
"Error Type 3", "Error Type 1")), row.names = c(NA, -7L), class = "data.frame")