Remove part of an element in a data frame in R

Time: 2014-07-14 22:26:41

Tags: r

I have a data frame (DF) like this:

   word
1  vet clinic New York 
2  super haircut Alabama 
3  best deal on dog drugs
4  doggy medicine Texas
5  cat healthcare
6  lizards that don't lie

I am trying to get the following data frame, with only the geographic names removed:

  word 
1 vet clinic
2 super haircut
3 best deal on dog drugs 
4 doggy medicine
5 cat healthcare
6 lizards that don't lie

My attempt below does not keep the remaining words once a geographic name is found:

vec <- # vector of geo names
DF <-DF[!grepl(vec,DF$word),]
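
For context, here is a small sketch of why this attempt loses whole rows. grepl() returns one TRUE/FALSE per row, so negating it and subsetting can only keep or drop entire rows; it never edits the text inside them. The sample values are copied from the question, and `geo` is an assumed single name standing in for the unspecified vector.

```r
# Sample data copied from the question
DF <- data.frame(
  word = c("vet clinic New York", "super haircut Alabama",
           "best deal on dog drugs", "doggy medicine Texas",
           "cat healthcare", "lizards that don't lie"),
  stringsAsFactors = FALSE
)

# Assumption: a single geographic name, because grepl() takes one pattern
geo <- "Texas"

# grepl() yields a logical vector: TRUE where the pattern occurs
hits <- grepl(geo, DF$word, fixed = TRUE)

# Logical subsetting removes the matching row entirely
kept <- DF[!hits, , drop = FALSE]
```

Editing the text rather than filtering rows calls for a substitution function such as gsub().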

3 Answers:

Answer 0 (score: 2)

Using @Ari's variables and data frame, a vectorized approach can use Reduce:

vec = c("New York", "Texas", "Alabama")
word = c("vet clinic New York", "super haircut Alabama", "best deal on dog drugs", "doggy medicine Texas", "cat healthcare", "lizards that don't lie")
df = data.frame(word=word)
df$word = as.character(df$word)

# Fold over vec: strip each name from every string, starting from df$word
Reduce(function(a, b) gsub(b, "", a, fixed = TRUE), vec, df$word)

[1] "vet clinic "            "super haircut "         "best deal on dog drugs" "doggy medicine "       
[5] "cat healthcare"         "lizards that don't lie"
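
To make the fold explicit, here is a minimal sketch of what Reduce does in this call, plus one possible way to drop the trailing space visible in the output. `strip_one` is a name introduced only for this illustration, and trimws() is a base R function that the answer itself does not use.

```r
vec <- c("New York", "Texas", "Alabama")
word <- c("vet clinic New York", "super haircut Alabama", "doggy medicine Texas")

# Illustrative helper: remove one literal name from every string
strip_one <- function(text, name) gsub(name, "", text, fixed = TRUE)

# Reduce(f, vec, init) computes f(f(f(init, vec[1]), vec[2]), vec[3])
folded <- Reduce(strip_one, vec, word)

# The same three steps written out by hand
by_hand <- strip_one(strip_one(strip_one(word, "New York"), "Texas"), "Alabama")

# trimws() removes the space left behind where a name was stripped
cleaned <- trimws(folded)
```

Each step operates on the whole character vector at once, which is why no loop over rows is needed.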

Answer 1 (score: 1)

As Henrik said, it would help if you included a reproducible example in your post. I will do that here:

vec = c("New York", "Texas", "Alabama")
word = c("vet clinic New York", "super haircut Alabama", "best deal on dog drugs", "doggy medicine Texas", "cat healthcare", "lizards that don't lie")
df = data.frame(word=word)
df$word = as.character(df$word)
df

                    word
1    vet clinic New York
2  super haircut Alabama
3 best deal on dog drugs
4   doggy medicine Texas
5         cat healthcare
6 lizards that don't lie

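In general, R gurus prefer vectorization over for loops. In this case, though, I found nested for loops with the stringr package to be the simplest way to solve the problem.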

library(stringr)
# Remove every name in vec from each row, one row and one name at a time
for (i in seq_len(nrow(df))) {
  for (j in seq_along(vec)) {
    df[i, "word"] = str_replace_all(df[i, "word"], vec[j], "")
  }
}
df

                word
1            vet clinic 
2         super haircut 
3 best deal on dog drugs
4        doggy medicine 
5         cat healthcare
6 lizards that don't lie

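I believe this code gives you the result you are looking for, apart from the trailing space left where a name was removed.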
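
Because R's string replacement functions are vectorized over the text argument, the loop over rows is not strictly needed. A minimal sketch of the same idea with a single loop over the names, assuming base R's gsub() in place of stringr's str_replace_all() and treating each name as a literal string:

```r
vec <- c("New York", "Texas", "Alabama")
df <- data.frame(
  word = c("vet clinic New York", "super haircut Alabama",
           "best deal on dog drugs", "doggy medicine Texas",
           "cat healthcare", "lizards that don't lie"),
  stringsAsFactors = FALSE
)

# One pass per name; each gsub() call handles the whole column at once
for (name in vec) {
  df$word <- gsub(name, "", df$word, fixed = TRUE)
}

# Remove the space left behind where a name was stripped
df$word <- trimws(df$word)
```

This does three passes over the column instead of one call per cell, and the final trimws() step is an addition that the loop above does not perform.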

Answer 2 (score: 1)

Using @Ari's example,

  library(stringr) 
  df$word <- str_trim(gsub(paste(vec,collapse="|"),"", df$word))
  df$word
 #[1] "vet clinic"             "super haircut"          "best deal on dog drugs"
 #[4] "doggy medicine"         "cat healthcare"         "lizards that don't lie"
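
One caveat with the alternation pattern: each name is interpreted as a regular expression, and a short name can match inside a longer word. The following is a hedged sketch of one way to guard against both, not part of the original answer. `escape_regex` is a hypothetical helper written for this illustration, and the extra names "St. Louis" and "Ala" and the extra rows are invented test values.

```r
# Hypothetical helper: backslash-escape regex metacharacters (PCRE syntax)
escape_regex <- function(x) {
  gsub("([.\\\\|()\\[\\]{}^$*+?])", "\\\\\\1", x, perl = TRUE)
}

# Invented names: one contains a metacharacter, one is a prefix of "Alabama"
vec <- c("New York", "Texas", "St. Louis", "Ala")
word <- c("vet clinic New York", "super haircut Alabama",
          "pet store St. Louis", "Stx Louis kennel")

# Wrap the escaped alternatives in word boundaries so "Ala" cannot
# match inside "Alabama"
pattern <- paste0("\\b(", paste(escape_regex(vec), collapse = "|"), ")\\b")

cleaned <- trimws(gsub(pattern, "", word, perl = TRUE))
```

Here "Alabama" is left alone because it is not in this `vec`, and "Stx Louis" is left alone because the escaped dot matches only a literal period. Note that `\\b` assumes each name starts and ends with a letter or digit.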