How do I extract column names that match any of several strings in R?

Asked: 2017-07-19 17:10:26

Tags: r regex

I have a data frame named t:
dput(t)
structure(list(Server = structure(c(2L, 3L, 4L, 5L, 1L, 1L), .Label = c("", 
"Server1", "Server2", "Server3", "Server4"), class = "factor"), 
    Date = structure(c(2L, 3L, 4L, 5L, 1L, 1L), .Label = c("", 
    "7/17/2017 15:01", "7/17/2017 15:02", "7/17/2017 15:03", 
    "7/17/2017 15:04"), class = "factor"), Host_CPU = c(1.161323547, 
    6.966178894, 0.656402588, 0.555137634, NA, NA), UsedMemPercent = c(11.33, 
    11.38, 11.38, 11.38, NA, NA), MY_REPORTING_NYAPP = c(1.05, 
    0.65, 0.52, 0.32, NA, NA)), .Names = c("Server", "Date", 
"Host_CPU", "UsedMemPercent", "MY_REPORTING_NYAPP"), class = "data.frame", row.names = c(NA, 
-6L))

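I need to be able to grep the names of the columns that contain any of the strings obtained by splitting an input on the underscore.

For example,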

app<-c("MY_NYAPP")

If any of the words in the app vector, which are separated by "_", appears in a column name, I need to grep that name and assign it to var.

app1<-unlist(strsplit(app, "_"))

var<-grep(app1,names(t), value=TRUE)

Any ideas?

1 Answer:

Answer 0 (score: 1)

If I understand correctly, for an input such as "MY_APP" you want to find the column names that contain both "MY" and "APP"?

t = structure(list(Server = structure(c(2L, 3L, 4L, 5L, 1L, 1L), .Label = c("", 
"Server1", "Server2", "Server3", "Server4"), class = "factor"), 
    Date = structure(c(2L, 3L, 4L, 5L, 1L, 1L), .Label = c("", 
    "7/17/2017 15:01", "7/17/2017 15:02", "7/17/2017 15:03", 
    "7/17/2017 15:04"), class = "factor"), Host_CPU = c(1.161323547, 
    6.966178894, 0.656402588, 0.555137634, NA, NA), UsedMemPercent = c(11.33, 
    11.38, 11.38, 11.38, NA, NA), MY_REPORTING_NYAPP = c(1.05, 
    0.65, 0.52, 0.32, NA, NA)), .Names = c("Server", "Date", 
"Host_CPU", "UsedMemPercent", "MY_REPORTING_NYAPP"), class = "data.frame", row.names = c(NA, 
-6L))

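app<-c("MY_NYAPP")

# Split the input on "_" to get the individual words to look for
app2 = unlist(strsplit(app, "_"))
# sapply() yields one logical column per word; keep the names whose row is TRUE for every word
colnames(t)[rowSums(sapply(app2, function(x) grepl(x, colnames(t)))) == length(app2)]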

Returns:

[1] "MY_REPORTING_NYAPP"
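
To see why the one-liner works, it can help to look at the intermediate result. The following is a minimal sketch that uses only the column names; the `cols` vector is copied by hand from the data frame above, and the names `hits` and `result` are introduced here purely for illustration.

```r
# Column names copied from the data frame above
cols <- c("Server", "Date", "Host_CPU", "UsedMemPercent", "MY_REPORTING_NYAPP")
app2 <- unlist(strsplit("MY_NYAPP", "_"))   # "MY" "NYAPP"

# One row per column name, one column per word; TRUE where the word occurs
hits <- sapply(app2, function(x) grepl(x, cols))

# rowSums() counts the matching words per name; a name is kept only
# when that count equals the number of words, i.e. every word matched
result <- cols[rowSums(hits) == length(app2)]
print(result)
```

Printing `hits` shows a logical matrix with five rows and two columns, which makes it easy to check which word matched which name.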

Hope this helps.
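
If the goal is instead to match names containing any one of the words (as the question title suggests), one possible approach is to join the words into a single alternation pattern, since `grep()` expects one pattern rather than a vector. This is a sketch under assumptions: the `"MY_OTHER_APP"` column is hypothetical and added only to make the difference visible, and the words are assumed to contain no regex metacharacters.

```r
# Real column names plus one hypothetical name ("MY_OTHER_APP") for contrast
cols <- c("Server", "Date", "Host_CPU", "UsedMemPercent",
          "MY_REPORTING_NYAPP", "MY_OTHER_APP")
words <- unlist(strsplit("MY_NYAPP", "_"))   # "MY" "NYAPP"

# "Any" match: join the words with "|" (regex alternation) into one pattern
any_pattern <- paste(words, collapse = "|")  # "MY|NYAPP"
any_match <- grep(any_pattern, cols, value = TRUE)

# "All" match: AND together one logical vector per word
all_match <- cols[Reduce(`&`, lapply(words, grepl, x = cols))]

print(any_match)
print(all_match)
```

Here `any_match` includes the hypothetical `"MY_OTHER_APP"` because it contains "MY", while `all_match` keeps only the name that contains both words.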