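I have a string that contains several "\n" characters. I want to go through it line by line and delete every line that contains the word "banana".
Sample DF: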
farm_data <- data.frame(shop=c('fruit'),
sentence=c('the basket contains apples
bananas are the best
are we going to eat bananas
why not just boil the fruits
let us make some banana smoothie'), stringsAsFactors=FALSE)
What I tried:
farm_data$sentence <- gsub(".* bananas .* \n", "\n", farm_data$sentence)
What I want:
clean_data <- data.frame(shop=c('fruit'),
sentence=c('the basket contains apples
why not just boil the fruits'), stringsAsFactors=FALSE)
The lines containing banana have been removed.
Thanks.
Answer 0 (score: 3)
x <- 'the basket contains apples
bananas are the best
are we going to eat bananas
why not just boil the fruits
let us make some banana smoothie'
cat(x)
# the basket contains apples
# bananas are the best
# are we going to eat bananas
# why not just boil the fruits
# let us make some banana smoothie
cat(gsub('.*banana.*\\n?', '', x, perl = TRUE))
# the basket contains apples
# why not just boil the fruits
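The same pattern can be applied straight to the column from the question. This is only a minimal sketch: it assumes the `farm_data` frame as defined in the question, and the extra `sub()` call that trims a leftover trailing newline is an addition for illustration, not part of the one-liner above.

```r
# Sample data from the question, with the line breaks written as \n
farm_data <- data.frame(shop = c('fruit'),
  sentence = c('the basket contains apples\nbananas are the best\nare we going to eat bananas\nwhy not just boil the fruits\nlet us make some banana smoothie'),
  stringsAsFactors = FALSE)

# With perl = TRUE, "." does not match a newline, so ".*banana.*"
# never reaches beyond the line that contains the word
cleaned <- gsub('.*banana.*\\n?', '', farm_data$sentence, perl = TRUE)

# When the last line is removed, the newline before it is left behind;
# trimming it is an extra step added here for illustration
farm_data$sentence <- sub('\n$', '', cleaned, perl = TRUE)

cat(farm_data$sentence)
```

Because `gsub()` is vectorised over its input, this should also work when the data frame has more than one row.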
Answer 1 (score: 1)
I may be solving this in a roundabout way. First I split the string on the newline character \n.
sentence <- unlist(strsplit(as.character(farm_data$sentence), '\n'))
After that, I remove the elements of the resulting split that contain the word "banana".
cleanSentence <- sentence[!grepl('banana', sentence)]
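A side note on this filtering step: negative indexing through `which()` has a known pitfall when nothing matches, which a logical mask from `grepl()` avoids. A small sketch with made-up input (the variable names are only for illustration):

```r
# Made-up input in which no element contains "banana"
sentence <- c('the basket contains apples', 'why not just boil the fruits')

# which() returns integer(0) here, and indexing with -integer(0)
# selects nothing, so every element is lost
dropped_everything <- sentence[-which(grepl('banana', sentence))]

# A logical mask keeps all elements in the same situation
kept_everything <- sentence[!grepl('banana', sentence)]

length(dropped_everything)
length(kept_everything)
```

The first length is 0 and the second is 2, which is why the logical form tends to be the safer default.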
Then I use the paste function to put it back together.
clean_data <- data.frame(shop=c('fruit'),
sentence= paste(cleanSentence, collapse='\n'), stringsAsFactors=FALSE)
I hope this isn't too convoluted. :)
To address your question about making this work for other "fruits" or strings:
cleanFruit <- function(fruit = 'banana'){
# Split the text into individual lines
sentence <- unlist(strsplit(as.character(farm_data$sentence), '\n'))
# Keep only the lines that do not contain `fruit`; a logical mask also works when nothing matches
cleanSentence <- sentence[!grepl(fruit, sentence)]
# Join the remaining lines back into a single string
clean_data <- data.frame(shop=c('fruit'),
sentence= paste(cleanSentence, collapse='\n'), stringsAsFactors=FALSE)
return(clean_data)
}
Wrap it in a function and pass in whichever fruit (or word) you want. @rawr's answer seems a bit cleaner, though.
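If the data frame has more than one row, the split/filter/paste steps can be run per element instead of on one flattened vector. The following is a rough sketch, not a definitive implementation: `drop_lines_with` is a hypothetical helper name, the second row of the sample data is made up for illustration, and `fixed = TRUE` is my assumption that the word should be matched literally rather than as a regular expression.

```r
# Hypothetical helper: for each element of `text`, drop the lines containing `word`
drop_lines_with <- function(text, word) {
  # strsplit() returns one character vector of lines per input element
  pieces <- strsplit(as.character(text), '\n', fixed = TRUE)
  vapply(pieces, function(lines) {
    # Logical mask: keep the lines that do not contain the word (literal match)
    kept <- lines[!grepl(word, lines, fixed = TRUE)]
    # Rebuild a single string for this element
    paste(kept, collapse = '\n')
  }, character(1))
}

# Sample data; the second row is invented for this example
farm_data <- data.frame(shop = c('fruit', 'veg'),
  sentence = c('the basket contains apples\nbananas are the best\nwhy not just boil the fruits',
               'carrots are fine\nno banana here please'),
  stringsAsFactors = FALSE)

farm_data$sentence <- drop_lines_with(farm_data$sentence, 'banana')
cat(farm_data$sentence, sep = '\n---\n')
```

Passing the text in as an argument, rather than reading `farm_data` from the enclosing environment, keeps the helper reusable on other columns or data frames.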