Searching a data.frame in R

Date: 2014-09-12 01:59:51

Tags: r

My data.frame is similar to the following simplified version:

ddf
  id                  country          area
1  1 United States of America North America
2  2           United Kingdom        Europe
3  3     United Arab Emirates          Arab
4  4             Saudi Arabia          Arab
5  5                   Brazil South America

ddf = structure(list(id = 1:5, country = c("United States of America", 
"United Kingdom", "United Arab Emirates", "Saudi Arabia", "Brazil"
), area = c("North America", "Europe", "Arab", "Arab", "South America"
)), .Names = c("id", "country", "area"), class = "data.frame", row.names = c(NA, 
-5L))

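I want to print the rows that contain the text 'america' (case-insensitive) in:

  1. any column
  2. the column named 'area'
  3. The number of rows and columns, and the column names, are variable, so I cannot use ddf[,1] etc.

    I tried the following, but it does not work: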

    ddf[apply(ddf, 1, function(x) grepl('america',x, ignore.case=T) ),]
       id              country   area
    2   2       United Kingdom Europe
    3   3 United Arab Emirates   Arab
    NA NA                 <NA>   <NA>
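
The attempt above returns the wrong rows because apply() over rows yields one logical value per cell (a 3 x 5 matrix here) rather than one per row, so the row index is 15 values long and selects rows 2, 3 and a non-existent row 15. A minimal sketch of a repair, assuming the goal is a single TRUE/FALSE per row, wraps grepl() in any(); the variable name keep is only illustrative:

```r
ddf <- data.frame(
  id = 1:5,
  country = c("United States of America", "United Kingdom",
              "United Arab Emirates", "Saudi Arabia", "Brazil"),
  area = c("North America", "Europe", "Arab", "Arab", "South America"),
  stringsAsFactors = FALSE
)

# apply() hands each row to the function as a character vector;
# any() collapses the per-cell matches into one logical per row.
keep <- apply(ddf, 1, function(x) any(grepl("america", x, ignore.case = TRUE)))
ddf[keep, ]
```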
    

4 Answers:

Answer 0 (score: 1)

Here is an approach using the qdap package:

library(qdap)
Search(ddf, "america")

##   id                  country          area
## 1  1 United States of America North America
## 5  5                   Brazil South America

See the source code for more on how it works.

For the second request...

Search(ddf, "america", "area")
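
For comparison, the second request can also be sketched in base R without qdap. This assumes the column name arrives as a string in a variable (called target here purely for illustration), which keeps the code independent of column positions:

```r
ddf <- data.frame(
  id = 1:5,
  country = c("United States of America", "United Kingdom",
              "United Arab Emirates", "Saudi Arabia", "Brazil"),
  area = c("North America", "Europe", "Arab", "Arab", "South America"),
  stringsAsFactors = FALSE
)

target <- "area"  # hypothetical variable holding the column name

# [[ ]] extracts the column by name; grepl() gives one logical per row.
in_area <- ddf[grepl("america", ddf[[target]], ignore.case = TRUE), ]
in_area

# Restricting the search matters: "united" occurs in country but not in area.
nrow(ddf[grepl("united", ddf[[target]], ignore.case = TRUE), ])
```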

Answer 1 (score: 1)

In base R:

ddf[do.call(mapply,c(any,lapply(ddf,grepl,pattern="america",ignore.case=TRUE))),]

#  id                  country          area
#1  1 United States of America North America
#5  5                   Brazil South America
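
The one-liner above packs two steps together: a grepl() pass over every column, then a row-wise "any" across the results. One way to spell the same idea out, shown only as a sketch, is to combine the per-column results with Reduce() and the element-wise OR operator; the names per_column and keep are illustrative. This variant also avoids passing column names as argument names to mapply().

```r
ddf <- data.frame(
  id = 1:5,
  country = c("United States of America", "United Kingdom",
              "United Arab Emirates", "Saudi Arabia", "Brazil"),
  area = c("North America", "Europe", "Arab", "Arab", "South America"),
  stringsAsFactors = FALSE
)

# Step 1: one logical vector per column (TRUE where that cell matches).
per_column <- lapply(ddf, grepl, pattern = "america", ignore.case = TRUE)

# Step 2: element-wise OR across the columns gives one logical per row.
keep <- Reduce(`|`, per_column)

ddf[keep, ]
```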

Answer 2 (score: 1)

 hasAm <- sapply(ddf, grepl, pattern = "america", ignore.case = TRUE)
 ddf[ rowSums(hasAm) > 0 , ]
  id                  country          area
1  1 United States of America North America
5  5                   Brazil South America

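The first value, hasAm, is a logical "image" of the data frame: a matrix of the same shape that is TRUE wherever a cell matches. The second line uses logical indexing to keep every row that contains at least one TRUE.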
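
Because the row and column counts and the column names vary, the same logical-matrix idea can be wrapped in a small helper that covers both requests. This is a sketch under stated assumptions: search_rows and its columns argument are hypothetical names, not part of base R or of this answer.

```r
# Hypothetical helper: return the rows of df in which any of the chosen
# columns matches pattern (case-insensitive). Defaults to all columns.
search_rows <- function(df, pattern, columns = names(df)) {
  hits <- vapply(df[columns], grepl, FUN.VALUE = logical(nrow(df)),
                 pattern = pattern, ignore.case = TRUE)
  # vapply() returns a plain vector for a one-row data frame;
  # rebuild the rows-by-columns shape so rowSums() always works.
  hits <- matrix(hits, nrow = nrow(df))
  df[rowSums(hits) > 0, , drop = FALSE]
}

ddf <- data.frame(
  id = 1:5,
  country = c("United States of America", "United Kingdom",
              "United Arab Emirates", "Saudi Arabia", "Brazil"),
  area = c("North America", "Europe", "Arab", "Arab", "South America"),
  stringsAsFactors = FALSE
)

search_rows(ddf, "america")          # request 1: any column
search_rows(ddf, "america", "area")  # request 2: only the 'area' column
```

Selecting columns by name with df[columns] means a misspelled column name raises an error instead of silently returning no rows.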

Answer 3 (score: 1)

It was suggested that I undelete this answer, so here it is.

Another way, using mapply:

> m <- mapply(grep, "america", ddf, ignore.case = TRUE)
> ddf[unique(unlist(m)), ]
#   id                  country          area
# 1  1 United States of America North America
# 5  5                   Brazil South America

You could also use lapply or sapply in the same manner:

> s <- sapply(ddf, grep, pattern = "america", ignore.case = TRUE)
> ddf[unique(unlist(s)), ]
#   id                  country          area
# 1  1 United States of America North America
# 5  5                   Brazil South America
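
One detail of the index-based approach: grep() returns row positions column by column, so the combined indices are not necessarily in row order. A minimal sketch, assuming the original row order should be preserved, sorts the unique indices before subsetting; the pattern "brazil|north" is chosen only to make the effect visible.

```r
ddf <- data.frame(
  id = 1:5,
  country = c("United States of America", "United Kingdom",
              "United Arab Emirates", "Saudi Arabia", "Brazil"),
  area = c("North America", "Europe", "Arab", "Arab", "South America"),
  stringsAsFactors = FALSE
)

# lapply() always returns a list, so unlist() behaves the same
# regardless of how many matches each column has.
idx <- unique(unlist(lapply(ddf, grep, pattern = "brazil|north",
                            ignore.case = TRUE)))
idx            # country matches row 5 first, then area matches row 1

ddf[sort(idx), ]   # rows back in their original order
```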