Loops and conditions on strings in R

Time: 2019-08-31 00:03:14

Tags: r

I want to create a new column based on the following conditions on the `str` column:

if the `str` column only contains `A` then insert `A`
if the `str` column only contains `B` then insert `B`
if the `str` column only contains `A` and `B` then insert `AB`

df <- read.table(text = "
ID   str
1    A
1    A
1    AA
1    ABB
2    BA
2    BB", header = TRUE)

Desired output:

ID   str   simplify_str
1    A        A
1    A        A
1    AA       A
1    ABB      AB
2    BA       AB
2    BB       B

3 answers:

Answer 0 (score: 3)

As a tidyverse option, you can use dplyr::case_when together with stringr::str_detect:

library(dplyr)
library(stringr)
df %>%
    mutate(simplify_str = case_when(
        str_detect(str, "^A+$") ~ "A",
        str_detect(str, "^B+$") ~ "B",
        TRUE ~ "AB"))
#  ID str simplify_str
#1  1   A            A
#2  1   A            A
#3  1  AA            A
#4  1 ABB           AB
#5  2  BA           AB
#6  2  BB            B
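
The anchors are what make the "only contains" test work: `^A+$` matches a string made up entirely of one or more `A`s, and every row that fails both tests falls through to "AB". As a minimal sketch, the same logic can be written in base R with `grepl` and nested `ifelse`; the vector `x` below is a hypothetical stand-in for `df$str`, not part of the original answer.

```r
# Hypothetical example vector mirroring the values in the question
x <- c("A", "AA", "ABB", "BA", "BB")

# TRUE only when every character in the string is "A"
only_a <- grepl("^A+$", x)
# TRUE only when every character in the string is "B"
only_b <- grepl("^B+$", x)

# Nested ifelse plays the role of case_when;
# anything that is neither all-A nor all-B becomes "AB"
simplified <- ifelse(only_a, "A", ifelse(only_b, "B", "AB"))
simplified
```

Note that, like the `TRUE ~ "AB"` branch above, the fallback labels any other string "AB", including strings with characters other than `A` and `B`.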

Answer 1 (score: 2)

Using a plain data.frame (base R):

As <- grep("A",df$str)
Bs <- grep("B",df$str)
df$simplify_str <- ""
df$simplify_str[As] <- paste0(df$simplify_str[As],"A")
df$simplify_str[Bs] <- paste0(df$simplify_str[Bs],"B")
df
  ID str simplify_str
1  1   A            A
2  1   A            A
3  1  AA            A
4  1 ABB           AB
5  2  BA           AB
6  2  BB            B
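
The approach above builds the label in two passes, appending "A" and then "B" at the matching positions. A possible one-step variant, sketched here under the assumption that only the letters `A` and `B` matter, uses `grepl` to get logical vectors instead of indices. The vector `s` is a hypothetical stand-in for `df$str`.

```r
# Hypothetical example vector mirroring the values in the question
s <- c("A", "A", "AA", "ABB", "BA", "BB")

# Logical flags: does the string contain the letter at all?
# fixed = TRUE treats the pattern as a literal string, not a regex
has_a <- grepl("A", s, fixed = TRUE)
has_b <- grepl("B", s, fixed = TRUE)

# Concatenate "A" and/or "B" depending on which letters are present
simplify_str <- paste0(ifelse(has_a, "A", ""), ifelse(has_b, "B", ""))
simplify_str
```

A string containing neither letter would end up as an empty string, which matches the behaviour of the two-pass version.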

Answer 2 (score: 2)

A general solution in base R, which splits each string into characters and pastes the unique characters together in sorted order.

df$simplify_str <- sapply(strsplit(as.character(df$str), ""), 
                   function(x) paste(unique(sort(x)), collapse = ""))

df
#  ID str simplify_str
#1  1   A            A
#2  1   A            A
#3  1  AA            A
#4  1 ABB           AB
#5  2  BA           AB
#6  2  BB            B
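
Because this approach does not hard-code the letters, it should extend to other characters as well. The sketch below wraps the same idea in a helper; the name `simplify` and the input values are hypothetical. It also swaps `sapply` for `vapply`, so the result is guaranteed to be a character vector even for empty input.

```r
# Hypothetical helper: collapse each string to its sorted unique characters
simplify <- function(x) {
  vapply(
    strsplit(as.character(x), ""),                       # split into single characters
    function(chars) paste(sort(unique(chars)), collapse = ""),
    character(1)                                         # each result is one string
  )
}

result <- simplify(c("BA", "CAB", "BBB"))
result
```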