Question

I have a df like this (data1):

 internode_length treatment genotype
1           98.165       sun       B3
2          116.633       sun       B3
3          103.474       sun       B3
4          120.683       sun       B3
5          109.128       sun       B3
6          129.076       sun       B3

I want to add a separate column to this df based on a condition:

for i in (1:nrow(data1)){
  if (data1$genotype == "B3") {
      data1$mutation = "wt"
} else if (data1$genotype == "ein9" & "ein194"){
      data1$mutation = "phyB"
} else {
      data1$mutation = "hy2"
}
}

But I get this error and a warning, and the code does not work:

Error: unexpected symbol in "for i"
>   if (data1$genotype == "B3") {
+       data1$mutation = "wt"
+ } else if (data1$genotype == "ein9"){
+       data1$mutation = "phyB"
+ } else {
+       data1$mutation = "hy2"
+ }
Warning message:
In if (data1$genotype == "B3") { :
  the condition has length > 1 and only the first element will be used
> }
Error: unexpected '}' in "}"

Any suggestions for fixing this?

Answer 1

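You should use ifelse. Unlike if, which looks at a single TRUE/FALSE value, ifelse is vectorized and evaluates the condition for every row:

# transform() returns a modified copy; assign it (data1 <- transform(...)) to keep the column
transform(data1,
          mutation = ifelse(genotype == "B3", "wt",
                     ifelse(genotype %in% c("ein9", "ein194"), "phyB", "hy2")))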

#      internode_length treatment genotype mutation
# 1           98.165       sun       B3       wt
# 2          116.633       sun       B3       wt
# 3          103.474       sun     ein9     phyB
# 4          120.683       sun       B3       wt
# 5          109.128       sun   ein194     phyB
# 6          129.076       sun       A2      hy2
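
If you would rather keep the loop from the question, it can probably be repaired along these lines. This is only a sketch: data1 is rebuilt here from the six rows shown in the output above so that the snippet runs on its own. The loop header needs parentheses around the whole "i in ..." clause, each comparison has to pick out a single row with [i], and membership in several values is tested with %in% rather than &.

```r
# Assumption: data1 holds the six rows shown in the output above
data1 <- data.frame(
  internode_length = c(98.165, 116.633, 103.474, 120.683, 109.128, 129.076),
  treatment = "sun",
  genotype = c("B3", "B3", "ein9", "B3", "ein194", "A2"),
  stringsAsFactors = FALSE
)

# Create the new column first, then fill it one row at a time
data1$mutation <- NA_character_
for (i in seq_len(nrow(data1))) {
  if (data1$genotype[i] == "B3") {                          # [i] compares a single value
    data1$mutation[i] <- "wt"
  } else if (data1$genotype[i] %in% c("ein9", "ein194")) {  # %in% tests several values
    data1$mutation[i] <- "phyB"
  } else {
    data1$mutation[i] <- "hy2"
  }
}
data1
```

The ifelse version is shorter and works on the whole column at once, so the loop is mainly useful for seeing what went wrong in the original attempt.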

Answer 2

A data.table alternative.

Toy data

#  internode_length treatment genotype
#            98.165       sun       B3
#           116.633       sun       B3
#           103.474       sun       B3
#           120.683       sun       B3
#           109.128       sun       B3
#           129.076       sun       B3
#           129.076       sun     ein9
#           129.076       sun   ein194
#           129.076       sun       XY

Code

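library(data.table)

# Build the toy data shown above as a data.table
mydata <- data.table(
  internode_length = c(98.165, 116.633, 103.474, 120.683, 109.128, rep(129.076, 4)),
  treatment = "sun",
  genotype = c(rep("B3", 6), "ein9", "ein194", "XY")
)
# := adds the column by reference, so no reassignment is needed
mydata[, new_col := ifelse(genotype == "B3", "wt",
                    ifelse(genotype %in% c("ein9", "ein194"), "phyB", "hy2"))]
mydata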

#    internode_length treatment genotype new_col
# 1:           98.165       sun       B3      wt
# 2:          116.633       sun       B3      wt
# 3:          103.474       sun       B3      wt
# 4:          120.683       sun       B3      wt
# 5:          109.128       sun       B3      wt
# 6:          129.076       sun       B3      wt
# 7:          129.076       sun     ein9    phyB
# 8:          129.076       sun   ein194    phyB
# 9:          129.076       sun       XY     hy2

Answer 3

You can also do this without using ifelse:

v1 <- factor(df$genotype)
v1
#[1] B3     B3     ein9   B3     ein194 A2    
#Levels: A2 B3 ein194 ein9

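Change the levels to the labels you want, in the same order as the existing levels (A2, B3, ein194, ein9). Here ein194 and ein9 should both become phyB, so that label appears twice and the two levels are merged.

levels(v1) <- c("hy2", "wt", "phyB", "phyB")
df$new_column <- as.character(v1)
df
#   internode_length treatment genotype new_column
#1           98.165       sun       B3         wt
#2          116.633       sun       B3         wt
#3          103.474       sun     ein9       phyB
#4          120.683       sun       B3         wt
#5          109.128       sun   ein194       phyB
#6          129.076       sun       A2        hy2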
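
Assigning a plain character vector to the levels depends on knowing how the levels are sorted. As a sketch of a variant that does not depend on that order, levels<- also accepts a named list that maps each new label to the old levels it should absorb. The data frame is rebuilt here only so the snippet runs on its own; new_column is the column name used in this answer.

```r
# Assumption: df holds the same six rows used in this answer
df <- data.frame(
  internode_length = c(98.165, 116.633, 103.474, 120.683, 109.128, 129.076),
  treatment = "sun",
  genotype = c("B3", "B3", "ein9", "B3", "ein194", "A2"),
  stringsAsFactors = FALSE
)

v1 <- factor(df$genotype)

# Each list name is a new label; its value lists the old levels merged into it
levels(v1) <- list(wt = "B3", phyB = c("ein9", "ein194"), hy2 = "A2")

df$new_column <- as.character(v1)
df
```

Any genotype that is not mentioned in the list becomes NA, so every level has to be listed explicitly with this form.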

Data

 df <- structure(list(internode_length = c(98.165, 116.633, 103.474, 
 120.683, 109.128, 129.076), treatment = c("sun", "sun", "sun", 
 "sun", "sun", "sun"), genotype = c("B3", "B3", "ein9", "B3", 
 "ein194", "A2")), .Names = c("internode_length", "treatment", 
"genotype"), row.names = c("1", "2", "3", "4", "5", "6"), class = "data.frame")

How to add a column based on a condition in R
