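I am trying to create a new variable whose values depend on whether the observations in a different variable contain certain characters. I tried the following code: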
site<- c('5.1', 'CD 1.1', 'FD 1', 'FD 2', 'FD 3', 'FD 4',
'FD 5', 'FD 6')
year<- c(2011, 2013, 2010, 2010, 2010, 2010, 2010, 2010)
diveLocation<- NA
df = data.frame(site, year, diveLocation)
df$diveLocation<-as.character(df$diveLocation)
df$diveLocation<- gsub("^C\\w+", "compliance", df$site)
head(df)
which gives:
    site year   diveLocation
1    5.1 2011            5.1
2 CD 1.1 2013 compliance 1.1
3   FD 1 2010           FD 1
4   FD 2 2010           FD 2
5   FD 3 2010           FD 3
6   FD 4 2010           FD 4
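The only good part is that "compliance" has been filled into 'diveLocation'. However, I want only "compliance" (i.e. without the 1.1 from the 'site' observation), and I don't want all the other 'site' observations (e.g. 5.1) carried over into 'diveLocation'; those rows should simply stay NA. Any suggestions would be much appreciated!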
Answer 0 (score: 0)
This code should do the job for you.
site<- c('5.1', 'CD 1.1', 'FD 1', 'FD 2', 'FD 3', 'FD 4',
'FD 5', 'FD 6')
year<- c(2011, 2013, 2010, 2010, 2010, 2010, 2010, 2010)
diveLocation<- NA
df = data.frame(site, year, diveLocation)
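# Label by the first character of site; NA_character_ gives a real missing value rather than the string "NA"
df$diveLocation <- ifelse(substr(df$site, 1, 1) == "C", "compliance", ifelse(substr(df$site, 1, 1) == "F", "Farm", NA_character_))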
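The question asks for only "compliance", with NA everywhere else. The sketch below uses base R only and shows why the original gsub() call copied the other sites over, followed by one possible way to get just the label. The pattern "^C" assumes that every compliance site starts with a capital C, which holds for the example data but may not hold for the full dataset.

```r
# Same example data as in the question
site <- c('5.1', 'CD 1.1', 'FD 1', 'FD 2', 'FD 3', 'FD 4', 'FD 5', 'FD 6')
year <- c(2011, 2013, 2010, 2010, 2010, 2010, 2010, 2010)
df <- data.frame(site, year, stringsAsFactors = FALSE)

# gsub() rewrites only the matched part ("CD") and returns every
# non-matching string unchanged, so 5.1, FD 1, ... pass straight through
gsub("^C\\w+", "compliance", df$site)

# grepl() returns TRUE/FALSE per element, so ifelse() can choose
# between the label and a character NA
df$diveLocation <- ifelse(grepl("^C", df$site), "compliance", NA_character_)
df
```

Because grepl() only tests for a match and never returns the input string, nothing from 'site' can leak into 'diveLocation'.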
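Answer 1 (score: 0)
We can use grep to create a numeric index of the 'site' values that match the pattern, then assign the value to the corresponding elements of 'diveLocation'.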
i1 <- grep("^CD", df$site)
df$diveLocation[i1] <- 'compliance'
df
# site year diveLocation
#1 5.1 2011 <NA>
#2 CD 1.1 2013 compliance
#3 FD 1 2010 <NA>
#4 FD 2 2010 <NA>
#5 FD 3 2010 <NA>
#6 FD 4 2010 <NA>
#7 FD 5 2010 <NA>
#8 FD 6 2010 <NA>
i2 <- grep("^FD", df$site)
df$diveLocation[i2] <- 'Farm'
Or use data.table:
library(data.table)
setDT(df)[grep("^CD", site), diveLocation := 'compliance'][]
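If more site prefixes need labels later, repeating one grep call per prefix gets tedious. A possible alternative, sketched here in base R, is a named lookup vector keyed on the prefix. The name `labels` and the rule "the prefix is everything before the first space" are assumptions made for this illustration, not something stated in the question.

```r
# Same example data as in the question
site <- c('5.1', 'CD 1.1', 'FD 1', 'FD 2', 'FD 3', 'FD 4', 'FD 5', 'FD 6')
year <- c(2011, 2013, 2010, 2010, 2010, 2010, 2010, 2010)
df <- data.frame(site, year, stringsAsFactors = FALSE)

# Hypothetical lookup table: site prefix -> label
labels <- c(CD = "compliance", FD = "Farm")

# Keep everything before the first space; strings without a space stay whole
prefix <- sub(" .*$", "", df$site)

# Indexing a named vector by name yields NA for unknown prefixes such as "5.1"
df$diveLocation <- unname(labels[prefix])
df
```

Adding another category then only means adding one more entry to `labels`.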
Answer 2 (score: 0)
Use the tidyverse packages with a combination of case_when and str_detect:
library(tidyverse)
site<- c('5.1', 'CD 1.1', 'FD 1', 'FD 2', 'FD 3', 'FD 4',
'FD 5', 'FD 6')
year<- c(2011, 2013, 2010, 2010, 2010, 2010, 2010, 2010)
diveLocation<- NA
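df = data.frame(site, year, diveLocation) %>% as_tibble()
new_df <- df %>%
  mutate(diveLocation = case_when(
    # "^" anchors the match to the start of site, so a C or F later in the string is ignored
    str_detect(site, pattern = "^C") ~ "compliance",
    str_detect(site, pattern = "^F") ~ "farm",
    TRUE ~ NA_character_  # every other site stays NA
  ))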
new_df