R: Selecting names from a data.frame using levels()

Date: 2014-10-21 20:03:16

Tags: r subset levels

I have two data.frames, input and filelist, which are defined and then joined in the code below:

input <- structure(list(NAME = structure(c(3L, 3L, 7L, 6L, 4L, 2L, 5L, 5L, 5L, 1L), .Label = c("Example2", "Example7", "Test", "Test2","Test3", "Test6", "Test77"), class = "factor"), REFERENCE = structure(c(2L,2L, 3L, 1L, 1L, 4L, 2L, 2L, 2L, 1L), .Label = c("EXAMPLE5", "REGION1", "REGION2", "REGION77"), class = "factor"), VALUE = structure(c(1L,1L, 2L, 3L, 4L, 6L, 5L, 5L, 5L, 7L), .Label = c("120", "13", "14", "65", "89", "B", "C"), class = "factor")), .Names = c("NAME", "REFERENCE", "VALUE"), class = "data.frame", row.names = c(NA,-10L))


filelist <- structure(list(NAME = structure(c(3L, 5L, 1L, 6L, 4L, 2L), .Label = c("","Example2", "Test", "Test2", "Test3", "Test6"), class = "factor"), REFERENCE = structure(c(3L, 3L, 1L, 2L, 2L, 2L), .Label = c("",  "EXAMPLE5", "REGION1"), class = "factor")), .Names = c("NAME","REFERENCE"), class = "data.frame", row.names = c(NA, -6L))

library(dplyr)
ana <- filelist %>% left_join(., input)

     NAME REFERENCE VALUE
1     Test   REGION1   120
2    Test3   REGION1    89
3                     <NA>
4    Test6  EXAMPLE5    14
5    Test2  EXAMPLE5    65
6 Example2  EXAMPLE5     C

The result is then split into separate data.frames, one per REFERENCE group:

list2env(split(ana, f = ana$REFERENCE )[-1], .GlobalEnv)
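For readers unfamiliar with this idiom, here is a minimal base R sketch on a made-up data frame (toy, groups and env are illustrative names, not part of the data above). split() returns one data.frame per level of the grouping factor, in level order, so [-1] drops the group for the first level, which here is the empty string, and list2env() assigns the remaining groups as variables. The sketch assigns into a fresh environment instead of .GlobalEnv, purely to keep the workspace clean.

```r
# Toy data with an empty-string level, mimicking the blank row in ana
toy <- data.frame(
  NAME      = factor(c("a", "b", "", "c")),
  REFERENCE = factor(c("R1", "R1", "", "R2"))
)

# One data.frame per level of REFERENCE; the "" level sorts first
groups <- split(toy, f = toy$REFERENCE)
names(groups)

# Drop the "" group and create one variable per remaining group
env <- new.env()
list2env(groups[-1], envir = env)
ls(env)

# Each piece still carries every level of the original NAME factor
levels(env$R1$NAME)
```

The last call is the root of the problem described next: the pieces produced by split() keep the level set of the full column.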

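This all works fine, but what I want to do next is access the names in the group REGION1. The result will later become part of a legend in which I do not want all the different items, only the selected ones. I tried to do this with the levels() command:
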
NAMES_REGION1 <- levels(REGION1$NAME)

But my output looks like this:

[1] ""         "Example2" "Test"     "Test2"    "Test3"    "Test6"

I expected the output to contain only "Test" and "Test3", because only those are part of the group REGION1. Any ideas why this happens?

1 Answer:

Answer 0: (score: 0)

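Subsetting a factor keeps its full set of levels, even those that no longer occur in the data. Because REGION1 was generated from a larger data set, you want to drop the unused levels that remain after subsetting. You can use droplevels: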

REGION1$NAME <- droplevels(REGION1$NAME)
levels(REGION1$NAME)
# [1] "Test"  "Test3"
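
The behaviour can be reproduced without the join. The following is a minimal sketch on a made-up factor (x and sub are illustrative names): subsetting keeps the full level set, droplevels() trims it, and unique(as.character()) returns the values actually present, which may be enough if only legend labels are needed.

```r
# A factor with four levels
x <- factor(c("Test", "Test3", "Test6", "Example2"))

# Keep only two of the values; the level set is unchanged
sub <- x[x %in% c("Test", "Test3")]
levels(sub)

# Drop the levels that no longer occur in sub
levels(droplevels(sub))

# Re-creating the factor has the same effect here
levels(factor(sub))

# The values present, as plain strings, ignoring levels entirely
unique(as.character(sub))
```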

Note that the REFERENCE column also carries over extra levels. You can drop the extra levels in both columns at once with:

REGION1[-3] <- lapply(REGION1[-3], droplevels) 

Now we can see that all the extra levels are gone:

lapply(REGION1[-3], levels)
# $NAME
# [1] "Test"  "Test3"
#
# $REFERENCE
# [1] "REGION1"
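
As a possible variation, droplevels() also has a data.frame method, so it can be applied to every group right after split() instead of column by column. The sketch below uses made-up data (toy, raw, groups and partial are illustrative names). Note that this also trims the VALUE column; the except argument skips columns by index if that is not wanted.

```r
# Toy stand-in for the joined data
toy <- data.frame(
  NAME      = factor(c("Test", "Test3", "Test6")),
  REFERENCE = factor(c("REGION1", "REGION1", "EXAMPLE5")),
  VALUE     = factor(c("120", "89", "14"))
)

# Split by group, then drop unused levels in every factor column of each piece
raw    <- split(toy, f = toy$REFERENCE)
groups <- lapply(raw, droplevels)
lapply(groups$REGION1, levels)

# Leave the third column (VALUE) untouched, like REGION1[-3] in the answer
partial <- droplevels(raw$REGION1, except = 3)
levels(partial$VALUE)
```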