Question

My data frame has a column named "ID" that contains sample names such as: C1, C2, C3, ... C20, O1, ..., O20.

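Now I want to create a new column named "treatment" that is filled with "conventional" wherever the ID has a "C" before the number, and with "organic" wherever it has an "O" before the number.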

Example

ID      treatment
C1      conventional
C2      conventional
O1      organic

I found a similar question here, but the answers there use the whole cell contents rather than just the part I need: Aggregate data in one column based on values in another column

Answer 1

Something like this?

mydf$treatment <- ifelse(substr(mydf$ID, 1, 1) == "C", "conventional", "organic")
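A minimal sketch of this approach on sample IDs like those in the question, assuming a data frame called `mydf` (the name and the sample values are illustrative). Note that `ifelse` here labels every ID that does not start with "C" as organic, so the optional guard below checks that no other prefix slipped in.

```r
# Sample data modeled on the question (mydf and its values are illustrative)
mydf <- data.frame(ID = c("C1", "C2", "C20", "O1", "O20"))

# Take the first character of each ID and map it to a treatment label
prefix <- substr(mydf$ID, 1, 1)
mydf$treatment <- ifelse(prefix == "C", "conventional", "organic")

# Optional guard: stop if any ID starts with something other than C or O
stopifnot(all(prefix %in% c("C", "O")))

print(mydf)
```

If more than two prefixes are possible, a lookup table or a multi-branch rule is probably easier to maintain than nested `ifelse` calls.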

Answer 2

You could also try the tidyverse approach. It has some powerful, easy-to-follow commands and avoids the pitfalls of nested if_else calls:

ID <- c('C1', 'C2', 'O1', 'CG18', 'OG20')
dat <- data.frame(ID)

library(tidyverse)  # library() stops with an error if the package is missing

# use 'case_when' from dplyr and 'str_detect' from stringr
dat <- dat %>%
  mutate(
    treatment = case_when(
      # 'C' anchored at the beginning, followed by digits
      str_detect(ID, '^C\\d+') ~ 'conventional',
      # 'O' anchored at the beginning, followed by digits
      str_detect(ID, '^O\\d+') ~ 'organic',
      # any ID containing 'G'
      str_detect(ID, 'G') ~ 'grassland',
      # everything else becomes a real missing value, not the string 'NA'
      TRUE ~ NA_character_
    )
  )
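For readers without the tidyverse installed, the same three rules can be sketched in base R with `grepl`. This is an illustrative equivalent and not part of the original answer; it assumes the same `dat` and `ID` values as above.

```r
ID <- c("C1", "C2", "O1", "CG18", "OG20")
dat <- data.frame(ID)

# Start with missing values, then fill in rule by rule
dat$treatment <- NA_character_
dat$treatment[grepl("^C[0-9]+", dat$ID)] <- "conventional"
dat$treatment[grepl("^O[0-9]+", dat$ID)] <- "organic"

# Only label as grassland what the earlier rules left unassigned,
# which mirrors the first-match-wins behavior of case_when
dat$treatment[is.na(dat$treatment) & grepl("G", dat$ID)] <- "grassland"

print(dat)
```

IDs that match none of the rules stay `NA`, which makes them easy to find afterwards with `is.na(dat$treatment)`.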

Answer 3

You can also do it like this:

df <- data.frame(ID = c("C1","C2","O1"))
lkp <- c(C = 'conventional', O = 'organic')
# or to answer your comment on other answer :
# lkp <- c(C = 'conventional', O = 'organic', CG = 'grassland', OG = 'grassland')
df$treatment <- lkp[gsub("\\d.*","",df$ID)]
df
#   ID    treatment
# 1 C1 conventional
# 2 C2 conventional
# 3 O1      organic
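As a sketch of how the extended lookup from the commented-out line might behave, the example below uses the longer prefixes from Answer 2 plus one hypothetical ID, "X7", that has no entry in the table. The `unname()` call is an addition here, used so the new column does not carry over the names of `lkp`.

```r
# IDs with one- and two-letter prefixes; "X7" is a made-up unmatched case
df <- data.frame(ID = c("C1", "O1", "CG18", "OG20", "X7"))
lkp <- c(C = "conventional", O = "organic", CG = "grassland", OG = "grassland")

# Strip everything from the first digit onward, leaving only the letter prefix
prefix <- sub("\\d.*", "", df$ID)

# Look up each prefix by name; prefixes missing from lkp yield NA
df$treatment <- unname(lkp[prefix])

print(df)
```

Because the lookup is just a named vector, adding a new treatment only requires adding one entry to `lkp`.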

Fill a new column using only part of the contents of another column
