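data.table subset does not return all values in the specified range

Time: 2014-03-14 15:02:31

Tags: r data.table subset

I have a data set in a data.table with a date column that I have split into year, month and day. I am trying to subset it by month.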

Here is the code I use for the subset:

ADVDT6 <- subset(ADVDT, mo %in% 7:12)

My data is here in a reproducible format:

library(data.table)

ADVDTtest <- data.table(structure(list(
    Issue.Date = structure(c(16041, 16056, 16067, 16011, 16042,
                             15859, 15862, 16021, 16023, 16011),
                           class = "Date"),
    Action = structure(c(4L, 4L, 1L, 1L, 4L, 2L, 2L, 1L, 3L, 1L),
                       .Label = c("Internal Complaint", "Make Good",
                                  "Make Good - Prod", "Nothing Given",
                                  "Pending"),
                       class = "factor"),
    yr = c("2013", "2013", "2013", "2013", "2013",
           "2013", "2013", "2013", "2013", "2013"),
    mo = c("12", "12", "12", "11", "12", "06", "06", "11", "11", "11"),
    da = c("02", "17", "28", "02", "03", "03", "06", "12", "14", "02")),
    .Names = c("Issue.Date", "Action", "yr", "mo", "da"),
    class = c("data.table", "data.frame"),
    row.names = c(NA, -10L)))

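My problem is that when I subset with this code, I only get output for months 10-12 rather than 7-12.

My ultimate goal is to draw a bar chart with ggplot2 that avoids the NA problem I ran into with the native barplot. However, I can't get the full subset of data that I want to plot!

Many thanks!

2 answers:

Answer 0: (score: 1)

I think this is what you are after. A sample of your data: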

dd = read.table(textConnection("Issue.Date Action yr mo da
12/2/13 Nothing Given   2013    12      2
12/17/13        Nothing Given   2013    12      17
12/28/13        Internal Complaint      2013    12      28
11/2/13 Internal Complaint      2013    11      2
12/3/13 Nothing Given   2013    12      3
6/3/13  Make Good       2013    6       3"),header=T)

Select the rows of the data frame where the month is between 7 and 12:

dd[which(dd$mo>6&dd$mo<13),]

Result:

> dd[which(dd$mo>6&dd$mo<13),]
         Issue.Date    Action   yr mo da
12/2/13     Nothing     Given 2013 12  2
12/17/13    Nothing     Given 2013 12 17
12/28/13   Internal Complaint 2013 12 28
11/2/13    Internal Complaint 2013 11  2
12/3/13     Nothing     Given 2013 12  3

Answer 1: (score: 1)

ADVDT6 <- subset(ADVDT, mo %in% 7:12) should also be correct. For me, it works on the part of the data that Chargaff used:

    > subset(dd, mo %in% 11:12)
            Issue.Date    Action   yr mo da
    12/2/13     Nothing     Given 2013 12  2
    12/17/13    Nothing     Given 2013 12 17
    12/28/13   Internal Complaint 2013 12 28
    11/2/13    Internal Complaint 2013 11  2
    12/3/13     Nothing     Given 2013 12  3

    > subset(dd, mo %in% 6:11)
            Issue.Date    Action   yr mo da
    11/2/13   Internal Complaint 2013 11  2
    6/3/13        Make      Good 2013  6  3

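Perhaps you should check whether your whole data.frame looks like dd, in particular whether mo is numeric there, as it is in dd.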
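
One thing such a check might turn up: in the dput output from the question, mo is a character column with zero-padded values such as "06", whereas read.table gives dd a numeric mo. Below is a minimal sketch of how that difference affects %in%. It uses base R only and a hypothetical mini data set (months_chr, chr_match and int_match are names made up for illustration); the same vector comparison would apply to a data.table column.

```r
# Hypothetical mini data set: month stored as zero-padded text,
# mirroring the mo column in the question's dput output
months_chr <- data.frame(mo = c("06", "07", "09", "10", "11", "12"),
                         stringsAsFactors = FALSE)

# %in% is built on match(), which coerces 7:12 to the strings
# "7", "8", ..., "12" when the other side is character.
# "07" and "09" are not equal to "7" and "9", so those rows drop out.
chr_match <- subset(months_chr, mo %in% 7:12)

# Converting the column to integer first makes the comparison numeric
int_match <- subset(months_chr, as.integer(mo) %in% 7:12)

print(chr_match$mo)  # only the two-digit months 10-12 survive
print(int_match$mo)  # months 07 through 12 that are present
```

If this is indeed the cause, storing mo as an integer when the date is first split (or comparing against formatted strings such as sprintf("%02d", 7:12)) would be two possible ways around it.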