How do I remove characters from specific rows in R?

Time: 2016-03-18 10:17:06

Tags: r split dataframe

If a particular value appears in V1, I want to remove it from the V2 column of the rows below it.

Input:

         V1               V2
19 49426099      19 49469087
19 49469087      19 49426099
6 29910378       6 29910742 - 6 29911064 - 6 29911086 - 6 29911092 - 6 29911154 
6 29910742       6 29910378 - 6 29911064 - 6 29911086 - 6 29911092 - 6 29911154 
6 29911064       6 29910378 - 6 29910742 - 6 29911086 - 6 29911092 - 6 29911154 

I expect this result:

         V1               V2
19 49426099      19 49469087
19 49469087      
6 29910378       6 29910742 - 6 29911064 - 6 29911086 - 6 29911092 - 6 29911154 
6 29910742       6 29911064 - 6 29911086 - 6 29911092 - 6 29911154 
6 29911064       6 29911086 - 6 29911092 - 6 29911154 

1 answer:

Answer 0: (score: 0)

Here is a sample data frame:

d = data.frame(V1 = c('19 49426099', '19 49469087', '6 29910378', '6 29910742', '6 29911064'),
               V2 = c('19 49469087', '19 49426099',
                      '6 29910742 - 6 29911064 - 6 29911086 - 6 29911092 - 6 29911154',
                      '6 29910378 - 6 29911064 - 6 29911086 - 6 29911092 - 6 29911154',
                      '6 29910378 - 6 29910742 - 6 29911086 - 6 29911092 - 6 29911154'),
               stringsAsFactors = FALSE) # keep both columns as character rather than factor

Here is a solution:

r = sapply(seq_len(nrow(d)), function(i){
  prev = d[seq_len(i - 1), 'V1'] # V1 values from the rows above row i
  if (length(prev) == 0) return(d[i, 'V2']) # first row: nothing to remove
  gsub(pattern = paste(prev, collapse = '|'), replacement = '', x = d[i, 'V2'])
})
r = gsub(pattern = '( - )+', replacement = ' - ', x = r) # collapse the doubled separators left where values were removed
r = gsub(pattern = '^ - ', replacement = '', x = r) # remove a leftover separator at the beginning
r = gsub(pattern = ' - $', replacement = '', x = r) # remove a leftover separator at the end
d$V2 = r
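
Because the gsub pattern matches substrings, a short V1 value could also match inside a longer V2 entry. A possible alternative, sketched here on the assumption that V2 entries are always separated by ' - ', is to split each V2 string into tokens and keep only those that do not appear in V1 of an earlier row. The helper name remove_seen is made up for this illustration and is not part of the original answer.

```r
# Hypothetical helper: drop from each V2 entry every value that
# appeared in V1 of an earlier row. Assumes ' - ' is the separator.
remove_seen = function(d, sep = ' - ') {
  d$V2 = vapply(seq_len(nrow(d)), function(i) {
    # split the V2 string of row i into individual values
    tokens = trimws(strsplit(d$V2[i], sep, fixed = TRUE)[[1]])
    # V1 values from the rows above row i (empty for the first row)
    seen = d$V1[seq_len(i - 1)]
    # keep only whole values that were not seen, then join them again
    paste(tokens[!tokens %in% seen], collapse = sep)
  }, character(1))
  d
}

d = data.frame(
  V1 = c('19 49426099', '19 49469087', '6 29910378', '6 29910742', '6 29911064'),
  V2 = c('19 49469087', '19 49426099',
         '6 29910742 - 6 29911064 - 6 29911086 - 6 29911092 - 6 29911154',
         '6 29910378 - 6 29911064 - 6 29911086 - 6 29911092 - 6 29911154',
         '6 29910378 - 6 29910742 - 6 29911086 - 6 29911092 - 6 29911154'),
  stringsAsFactors = FALSE
)

res = remove_seen(d)
print(res)
```

Comparing whole tokens with %in% avoids both the partial-match problem and the separator clean-up steps, at the cost of assuming a fixed separator.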