Look up values in a column and return TRUE / FALSE

Date: 2014-09-17 10:20:46

Tags: r return-value where aggregate string-matching

If this is my code:

df<-data.frame(speaker=c("nancyball","nancyball","wigglet","wigglet"),
               phrase=c("the cat is on the hat",
                        "the cat runs",
                        "the cat is under the bowl",
                        "a cat plays"))

prep.list<-c("on","under","in")

I want a new column (df$kind) added to df whose value is T or F depending on whether one of the words in prep.list occurs in df$phrase.

There must be a simple way to do this.

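Even better, I would like df$kind to return something different if, for example, I also have:

verb.list<-c("plays","sings","sits")

I would get:

"prep, F, prep, verb"

I tried where(), apply() and grep(), but they either would not coerce my column to a vector or lost the dimensions.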

2 Answers:

Answer 0 (score: 3)

You could try:

  df$kind <- grepl(paste(prep.list, collapse="|"), df$phrase)
  df
  #   speaker                    phrase  kind
  #1 nancyball     the cat is on the hat  TRUE
  #2 nancyball              the cat runs FALSE
  #3   wigglet the cat is under the bowl  TRUE
  #4   wigglet               a cat plays FALSE
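
One caveat with the alternation pattern: "on" or "in" also matches inside longer words. A minimal sketch of a whole-word variant follows; the `phrases` vector (including the "salmon" sentence) and the names `pattern_plain` and `pattern_word` are made up for illustration and are not part of the question.

```r
prep.list <- c("on", "under", "in")

# Hypothetical test phrases; the second one contains "on" only inside "salmon"
phrases <- c("the cat is on the hat",
             "the salmon swims",
             "the cat is under the bowl")

# Plain alternation, as in the answer above
pattern_plain <- paste(prep.list, collapse = "|")

# Same alternation wrapped in word boundaries so only whole words match
pattern_word <- paste0("\\b(", pattern_plain, ")\\b")

plain <- grepl(pattern_plain, phrases)
whole <- grepl(pattern_word, phrases, perl = TRUE)
```

Here `plain` flags the "salmon" phrase while `whole` does not. Whether substring matches are wanted depends on the data, so treat this as an option rather than a correction.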


 indx1 <- grepl(paste(prep.list, collapse="|"), df$phrase)
 indx2 <- grepl(paste(verb.list, collapse="|"), df$phrase)

After seeing @Jaap's answer, I think you want:

  df$kind <- c("F", "prep", "verb")[as.numeric(factor(1*indx1+2*indx2))] #updated based on comments from @alexis_laz
  df
  #  speaker                    phrase kind
 #1 nancyball     the cat is on the hat prep
 #2 nancyball              the cat runs    F
 #3   wigglet the cat is under the bowl prep
 #4   wigglet               a cat plays verb
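
The factor-based lookup above relies on all three codes (0, 1, 2) being present in the data: if, say, only the "prep" and "verb" rows were kept, `factor()` would renumber the codes 1 and 2 as 1 and 2, and the labels would come out as "F" and "prep". A sketch of a variant that indexes the label vector directly is below; the fourth label "prep_verb" for phrases matching both lists is an assumption, not something the question specifies.

```r
phrases <- c("the cat is on the hat",
             "the cat runs",
             "the cat is under the bowl",
             "a cat plays")
prep.list <- c("on", "under", "in")
verb.list <- c("plays", "sings", "sits")

indx1 <- grepl(paste(prep.list, collapse = "|"), phrases)
indx2 <- grepl(paste(verb.list, collapse = "|"), phrases)

# Code 0 = no match, 1 = prep only, 2 = verb only, 3 = both;
# adding 1 turns the code into a position in the label vector
labels <- c("F", "prep", "verb", "prep_verb")  # "prep_verb" is an assumed label
kind <- labels[1 + indx1 + 2 * indx2]

# The same lookup on a subset still yields the right labels
kind_subset <- labels[1 + indx1[c(1, 4)] + 2 * indx2[c(1, 4)]]
```

Because the position is computed from the codes themselves, the result does not depend on which combinations happen to occur.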

Update

Suppose you have several lists and elements from more than one list match df$phrase. One way:

 new.list <- c("hat", "bowl", "howl")
 nm1 <- ls(pattern="\\.list$")  # escape the dot and anchor the end so only names ending in ".list" are picked up
 lst1 <- mget(nm1)
 indx2 <- sapply(names(lst1), function(x) {x1 <- gsub("\\..*", "", x)
                                indx <- grepl(paste(lst1[[x]], collapse="|"), df$phrase)
                                      c(NA, x1)[indx+1]})

   df$kind <- ifelse(rowSums(is.na(indx2))==ncol(indx2), "F", 
                apply(indx2, 1, function(x) paste(x[!is.na(x)], collapse="_")))

   df
   #   speaker                    phrase     kind
   #1 nancyball     the cat is on the hat new_prep
   #2 nancyball              the cat runs        F
   #3   wigglet the cat is under the bowl new_prep
   #4   wigglet               a cat plays     verb
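
The approach above collects the word lists from the workspace by name. As an alternative sketch, the lists can be kept in one named list so that nothing depends on what else is defined in the session; `word_lists`, `hits` and `phrases` are illustrative names, not part of the original answer.

```r
phrases <- c("the cat is on the hat",
             "the cat runs",
             "the cat is under the bowl",
             "a cat plays")

# One named list instead of separate *.list objects
word_lists <- list(new  = c("hat", "bowl", "howl"),
                   prep = c("on", "under", "in"),
                   verb = c("plays", "sings", "sits"))

# Logical matrix: one row per phrase, one column per word list
hits <- sapply(word_lists, function(words) {
  grepl(paste(words, collapse = "|"), phrases)
})

# Join the names of all matching lists, or fall back to "F"
kind <- apply(hits, 1, function(row) {
  if (any(row)) paste(names(word_lists)[row], collapse = "_") else "F"
})
```

With more than one phrase, `sapply()` returns a matrix here; for a single phrase it would return a plain vector, so `apply()` would need adjusting in that case.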

Answer 1 (score: 1)

Extending @akrun's answer, you can combine several comparisons:

df$kind <- ifelse(grepl(paste(verb.list, collapse="|"), df$phrase), "verb",
                  ifelse(grepl(paste(prep.list, collapse="|"), df$phrase), "prep", F))

This gives:

    speaker                    phrase  kind
1 nancyball     the cat is on the hat  prep
2 nancyball              the cat runs FALSE
3   wigglet the cat is under the bowl  prep
4   wigglet               a cat plays  verb

If you explicitly need F instead of FALSE, replace F with "F" in the ifelse statement.
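
For reference, a sketch of that variant on a standalone vector (`phrases` and `kind` are illustrative names). The verb test comes first, so a phrase matching both lists would be labelled "verb".

```r
phrases <- c("the cat is on the hat",
             "the cat runs",
             "the cat is under the bowl",
             "a cat plays")
prep.list <- c("on", "under", "in")
verb.list <- c("plays", "sings", "sits")

# Nested ifelse with the string "F" as the fallback label
kind <- ifelse(grepl(paste(verb.list, collapse = "|"), phrases), "verb",
               ifelse(grepl(paste(prep.list, collapse = "|"), phrases), "prep", "F"))
```

Either way the result is a character vector, since mixing "verb"/"prep" with a logical `F` coerces the logical to the string "FALSE".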