Match a string and paste it onto the row above

Time: 2017-03-17 20:46:44

Tags: r regex

I have the following data.frame:

d <- data.frame(id = c(1:20),
                name = c("Paraffinole (CAS 8042-47-5)", "Pirimicarb", "Rapsol", "Thiacloprid", 
                     "Chlorantraniliprole", "Flonicamid", "Tebufenozid", "Fenoxycarb", 
                     "Bacillus thuringiensis subspecies", "aizawai Stamm AB", "Methoxyfenozide", 
                     "Acequinocyl", "lndoxacarb", "Acetamiprid", "Spirotet_r:amat", 
                     "Cydia pomonella Granulovirus", "mexikanischer Stamm", "lmidacloprid", 
                     "Spirodiclofen", "Pyrethrine"),
                desc = LETTERS[1:20])

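The name column contains two entries with the string 'Stamm'. I want to select these entries, paste each one onto the entry in the row above it, and then delete the matched row. So d$name[9] should end up as "Bacillus thuringiensis subspecies__aizawai Stamm AB" and d$name[16] as "Cydia pomonella Granulovirus__mexikanischer Stamm". Rows 10 and 17, i.e. d[c(10, 17), ], should then be deleted.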
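To make the target rows concrete, here is a minimal sketch of the matching step on its own. The short vector `name` is a made-up stand-in for d$name, not the full data, and the case-insensitive match is an assumption about what should count as a hit.

```r
# Stand-in for d$name (assumption: same pattern as in the real column)
name <- c("Bacillus thuringiensis subspecies", "aizawai Stamm AB", "Methoxyfenozide")

# Positions of the entries that contain "stamm", ignoring case
hits <- grep("stamm", name, ignore.case = TRUE)

# Positions of the entries they should be pasted onto
targets <- hits - 1
```

On the full data.frame the same call should point at rows 10 and 17, with rows 9 and 16 as the targets.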

How can I match a string and paste it onto the row above?

2 Answers:

Answer 0: (score: 1)

How about this?

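library(stringr)
d$name <- as.character(d$name)  # the column may be a factor
# Row numbers of the entries containing "Stamm"
where_stamm <- which(str_detect(d$name, "Stamm"))
# Paste each match onto the entry in the row above it
for (i in where_stamm) {
  d$name[i-1] <- paste(d$name[i-1], d$name[i], sep = '__')
}
# Drop the matched rows (assumes there is at least one match)
d <- d[-where_stamm, ]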

> d$name[9]
[1] "Bacillus thuringiensis subspecies__aizawai Stamm AB"
> d$name[15]
[1] "Cydia pomonella Granulovirus__mexikanischer Stamm"

(Note that "Cydia pomonella ...." is now at position 15, because we deleted row 10.)
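The same idea can also be written without the loop and without stringr. This is only a sketch on a shortened, made-up data.frame; it assumes that the first row never matches and that no two matching rows follow each other directly.

```r
# Shortened example data (hypothetical subset of the rows in the question)
d <- data.frame(id = 1:5,
                name = c("Fenoxycarb", "Bacillus thuringiensis subspecies",
                         "aizawai Stamm AB", "Cydia pomonella Granulovirus",
                         "mexikanischer Stamm"),
                desc = LETTERS[1:5],
                stringsAsFactors = FALSE)

# Flag the matching rows once, before any names are modified
is_stamm <- grepl("Stamm", d$name, fixed = TRUE)
hit <- which(is_stamm)

# Paste every match onto the row above it in one vectorised step
d$name[hit - 1] <- paste(d$name[hit - 1], d$name[hit], sep = "__")

# Drop the matched rows; the logical index also works when nothing matched
d <- d[!is_stamm, ]
rownames(d) <- NULL
```

Indexing with !is_stamm rather than -hit avoids one pitfall: with no matches, a negative index of length zero would select no rows at all.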

Answer 1: (score: 1)

Here is a solution using dplyr:

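library(dplyr)
d %>% 
  mutate(
    # Flag the rows that contain "stamm" (case-insensitive)
    to_delete = grepl("stamm", name, ignore.case = TRUE),
    # If the next row is flagged, append its name to the current one
    name = if_else(lead(to_delete, default = FALSE), paste(name, lead(name), sep = "__"), 
                   as.character(name))
  ) %>% 
  filter(!to_delete) %>%
  select(-to_delete)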
#    id                                                name desc
# 1   1                         Paraffinole (CAS 8042-47-5)    A
# 2   2                                          Pirimicarb    B
# 3   3                                              Rapsol    C
# 4   4                                         Thiacloprid    D
# 5   5                                 Chlorantraniliprole    E
# 6   6                                          Flonicamid    F
# 7   7                                         Tebufenozid    G
# 8   8                                          Fenoxycarb    H
# 9   9 Bacillus thuringiensis subspecies__aizawai Stamm AB    I
# 10 11                                     Methoxyfenozide    K
# 11 12                                         Acequinocyl    L
# 12 13                                          lndoxacarb    M
# 13 14                                         Acetamiprid    N
# 14 15                                     Spirotet_r:amat    O
# 15 16   Cydia pomonella Granulovirus__mexikanischer Stamm    P
# 16 18                                        lmidacloprid    R
# 17 19                                       Spirodiclofen    S
# 18 20                                          Pyrethrine    T
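
Both answers rely on each 'Stamm' row directly following the entry it belongs to. If several such rows could follow one another, one possible variant is to group every matching row with the nearest non-matching row above it. The sketch below uses base R on made-up data (the entry "zweiter Stamm" is invented for illustration) and assumes the first row is never a match.

```r
# Made-up data with two matching rows in a row
d2 <- data.frame(id = 1:5,
                 name = c("Fenoxycarb", "Bacillus thuringiensis subspecies",
                          "aizawai Stamm AB", "zweiter Stamm", "Methoxyfenozide"),
                 stringsAsFactors = FALSE)

# TRUE for rows that should be merged upwards
cont <- grepl("stamm", d2$name, ignore.case = TRUE)

# Every non-matching row starts a new group; matches join the group above
grp <- cumsum(!cont)

# Collapse the names within each group
merged <- tapply(d2$name, grp, paste, collapse = "__")

# Keep only the group-leading rows and give them the merged names
out <- d2[!cont, ]
out$name <- as.vector(merged)
```

Here out keeps the id values of the rows that start each group, which matches how the answers above keep id 9 and 16 in the original data.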