I have a data frame containing events of several different kinds, and I want to fill an empty "new_name" column using mutate / ifelse. Basically, I want new_name filled according to the following conditions: if status is "unaccepted", new_name should take the value of "valid_name"; if status is "accepted" or NA, new_name should take the value of "species". Here is an example of the data frame structure:
species | valid_name | new_name | status
1. Tilapia guineensis | NA | NA | NA
2. Tilapia zillii | Hippocampus trimaculatus | NA | unaccepted
3. Fundulus rubrifrons | Hippocampus trimaculatus | NA | unaccepted
4. Eutrigla gurnardus | Bougainvillia supercili | NA | accepted
5. Sprattus sprattus | NA | NA | NA
6. Gadus morhua | Aglantha digitale | NA | accepted
So far I have tried the following:
df<-df%>%
mutate(new_name = ifelse(status=="unaccepted",valid_name,ifelse(status=="accepted" | is.na(status),species,NA)))
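This code only works for rows where "status" is not NA. Where status is NA, it just skips the row and does nothing, leaving new_name as NA. The data frame I am trying to get would look something like this:
species | valid_name | new_name | status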
1. Tilapia guineensis | NA | Tilapia guineensis | NA
2. Tilapia zillii | Hippocampus trimaculatus | Hippocampus trimaculatus | unaccepted
3. Fundulus rubrifrons | Hippocampus trimaculatus | Hippocampus trimaculatus | unaccepted
4. Eutrigla gurnardus | Bougainvillia supercili | Eutrigla gurnardus | accepted
5. Sprattus sprattus | NA | Sprattus sprattus | NA
6. Gadus morhua | Aglantha digitale | Gadus morhua | accepted
Thanks in advance for your answers.
Answer 0 (score: 1)
If we use ==, make sure to also add an is.na check so that the condition returns TRUE / FALSE; otherwise a comparison against NA stays NA.
library(dplyr)
df %>%
  mutate(new_name = ifelse(status == "unaccepted" & !is.na(status), valid_name,
                    # both remaining branches return species, so NA rows get species too
                    ifelse(status == "accepted" & !is.na(status), species, species)))
# species valid_name status new_name
#1 Tilapia guineensis <NA> <NA> Tilapia guineensis
#2 Tilapia zillii Hippocampus trimaculatus unaccepted Hippocampus trimaculatus
#3 Fundulus rubrifrons Hippocampus trimaculatus unaccepted Hippocampus trimaculatus
#4 Eutrigla gurnardus Bougainvillia supercili accepted Eutrigla gurnardus
#5 Sprattus sprattus <NA> <NA> Sprattus sprattus
#6 Gadus morhua Aglantha digitale accepted Gadus morhua
Another option is to use %in%, which returns FALSE for NA.
df%>%
mutate(new_name = ifelse(status %in% "unaccepted" ,valid_name,
ifelse(status %in% "accepted",species, species)))
Using a reproducible example:
v1 <- c('a', 'b', NA)
v1 == 'a'
#[1]  TRUE FALSE    NA
v1 %in% 'a'
#[1] TRUE FALSE FALSE
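To connect this to the original problem, here is a minimal base R sketch of how ifelse reacts to each kind of test. The three toy vectors are hypothetical stand-ins, not the question's data; the point is only that an NA in the test becomes an NA in the result, while a FALSE sends the row to the "no" branch.

```r
# Hypothetical toy vectors (assumed for illustration, not the question's data)
status     <- c("unaccepted", "accepted", NA)
valid_name <- c("B1", "B2", NA)
species    <- c("A1", "A2", "A3")

# `==` gives NA for the missing status, and ifelse() passes that NA through
res_eq <- ifelse(status == "unaccepted", valid_name, species)

# `%in%` gives FALSE for the missing status, so the "no" branch (species) is used
res_in <- ifelse(status %in% "unaccepted", valid_name, species)

print(res_eq)
print(res_in)
```

The third element of res_eq is NA, whereas in res_in it takes the species value, which is the behavior the question asks for.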
df <- structure(list(species = c("Tilapia guineensis", "Tilapia zillii",
"Fundulus rubrifrons", "Eutrigla gurnardus", "Sprattus sprattus",
"Gadus morhua"), valid_name = c(NA, "Hippocampus trimaculatus",
"Hippocampus trimaculatus", "Bougainvillia supercili", NA,
"Aglantha digitale"
), status = c(NA, "unaccepted", "unaccepted", "accepted", NA,
"accepted")), class = "data.frame", row.names = c(NA, -6L))
Answer 1 (score: 0)
I would like to offer an alternative using case_when from dplyr, which provides a nice and intuitive syntax:
library(dplyr)
df <- structure(list(species = c("Tilapia guineensis", "Tilapia zillii",
"Fundulus rubrifrons", "Eutrigla gurnardus", "Sprattus sprattus",
"Gadus morhua"), valid_name = c(NA, "Hippocampus trimaculatus",
"Hippocampus trimaculatus", "Bougainvillia supercili", NA,
"Aglantha digitale"
), status = c(NA, "unaccepted", "unaccepted", "accepted", NA,
"accepted")), class = "data.frame", row.names = c(NA, -6L))
df <- df %>%
mutate(new_name = case_when(
status == "unaccepted" ~ valid_name,
status == "accepted" | is.na(status) ~ species
))
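A short base R sketch of why the two conditions above cover the NA rows, assuming (as I understand the dplyr documentation) that case_when treats an NA condition as "no match" and moves on to the next one. The status vector here is a hypothetical three-element stand-in for the column.

```r
# Hypothetical stand-in for the status column
status <- c("unaccepted", "accepted", NA)

# First condition: NA for the missing status, so that row is not matched here
first_cond <- status == "unaccepted"

# Second condition: NA | TRUE evaluates to TRUE, so the missing status is caught
second_cond <- status == "accepted" | is.na(status)

print(first_cond)
print(second_cond)
```

Because `NA | TRUE` is TRUE in R's three-valued logic, the `| is.na(status)` part is what routes the NA rows to species.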