Iterating over every character in each row of a column

Asked: 2018-11-08 19:29:13

Tags: r regex apply

An example of the column:

test <- c('apple #1930', 'apple #84555', 'apple A #33859', 'apple good', 'peach brand A - level 1 #8839', 'peach brand A - middle or not', 'peach brand A #2283')

I would like my result table to be:

Name            Description     Number
apple           NA              #1930
apple           NA              #84555
apple           A               #33859
apple           good            NA
peach brand A   level 1         #8839
peach brand A   middle or not   NA
peach brand A   NA              #2283

I have tried:

findiffs <- rle(test)

newdf <- data.frame(
                    firststring = test[cumsum(findiffs$length)],
                    secondstring = test[cumsum(findiffs$length)+1]
                    )

newdf <- newdf[-dim(newdf)[1],] 

But it does not give me the output I want.

Any help would be greatly appreciated!

1 answer:

Answer 0 (score: 0)

I am guessing that each column has its own delimiter character. If so, you may want to try something like this:

library(tidyr)  # provides separate() and re-exports %>%

test <- data.frame(orig = c('apple #1930', 'apple #84555', 'apple A #33859', 'apple good', 'peach brand A - level 1 #8839', 'peach brand A - middle or not', 'peach brand A #2283'))

# Split on "#" first to isolate the number, then split the remainder on "-"
test %>% separate(orig, into = c("a", "b"), sep = "[#]") %>% separate(a, into = c("aa", "bb"), sep = "[-]")

              aa             bb     b
1         apple            <NA>  1930
2         apple            <NA> 84555
3       apple A            <NA> 33859
4     apple good           <NA>  <NA>
5 peach brand A        level 1   8839
6 peach brand A   middle or not  <NA>
7 peach brand A            <NA>  2283
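
Note that in the output above, rows 3 and 4 still carry "A" and "good" inside the name column, because no delimiter separates them from "apple". The sample data alone does not say where a name ends ("apple A" splits, but "peach brand A" does not), so the following is only a rough base-R sketch under one loudly stated assumption: that the list of valid names is known in advance. The vector `known_names` is hypothetical and not part of the original question.

```r
test <- c('apple #1930', 'apple #84555', 'apple A #33859', 'apple good',
          'peach brand A - level 1 #8839', 'peach brand A - middle or not',
          'peach brand A #2283')

# ASSUMPTION: the valid names are known up front (hypothetical vector).
known_names <- c("peach brand A", "apple")

# Number: the "#digits" token, if present; NA otherwise.
m <- regexpr("#[0-9]+", test)
number <- rep(NA_character_, length(test))
number[m != -1] <- regmatches(test, m)

# Name: the first known name that the string starts with.
name <- vapply(test, function(s) {
  hit <- known_names[startsWith(s, known_names)]
  if (length(hit)) hit[1] else NA_character_
}, character(1), USE.NAMES = FALSE)

# Description: what is left after dropping the name and the number,
# with leading spaces/hyphens and surrounding whitespace removed.
rest <- substring(test, nchar(name) + 1)
rest <- sub("#[0-9]+", "", rest)
rest <- trimws(gsub("^[ -]+", "", rest))
description <- ifelse(nzchar(rest), rest, NA_character_)

result <- data.frame(Name = name, Description = description, Number = number,
                     stringsAsFactors = FALSE)
print(result)
```

If the names are not known ahead of time, some other rule for where the name ends would have to be supplied; the regular expressions here only handle the "#number" and " - " delimiters visible in the sample.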