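How do I subset a specific date range in R?

Time: 2018-10-21 08:43:56

Tags: r date

I am trying to get the subset of a data frame that falls within a date range, where the dates are formatted like "2016-10-01 00:00".

Below is my current code, which produces the following warnings: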

Warning messages:
1: In which(classsched$DateTime >= x & classsched$DateTime <= y) :
Incompatible methods ("Ops.factor", "Ops.Date") for ">="
2: In which(classsched$DateTime >= x & classsched$DateTime <= y) :
Incompatible methods ("Ops.factor", "Ops.Date") for "<="

My code:

dateFunction <- function(x, y){
  # the trailing comma selects rows and keeps all columns
  classsched[which(classsched$DateTime >= x & classsched$DateTime <= y), ]
}

date1 = as.Date('2016-10-01 00:00', format="%Y-%m-%d %H:%M")
date2 = as.Date('2017-10-31 23:59', format="%Y-%m-%d %H:%M")

test <- dateFunction(date1, date2)

My dataset:

 DateTime            Course    Professor-in-Time
2016-01-01 11:10    CS        Morgan
2016-10-03 12:16    Eng       Andrew
2017-05-05 13:17    Poetry    Jen
2018-04-15 14:11    Reading   Eugene
2018-05-20 15:21    Math      Matt

DateTime <- as.Date(c('2016-01-01 11:10', '2016-10-03 12:16', '2017-05-05 13:17',
                      '2018-04-15 14:11', '2018-05-20 15:21'))
Course <- c('CS', 'Eng', 'Poetry', 'Reading', 'Math')
# backticks are needed because the name contains hyphens
`Professor-in-Time` <- c('Morgan', 'Andrew', 'Jen', 'Eugene', 'Matt')
classsched <- data.frame(DateTime, Course, `Professor-in-Time`, check.names = FALSE)

So the output should be:

 DateTime            Course    Professor-in-Time
2016-10-03 12:16    Eng       Andrew
2017-05-05 13:17    Poetry    Jen

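I originally based my code on the Stack Overflow question "Subset a dataframe between 2 dates".

1 answer:

Answer 0: (score: 1)

How about something like this?

First, you should provide some easily reproducible data, e.g. by using dput. Second, your date formats are mixed up.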
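As a minimal sketch of what sharing data with dput could look like (the two-row data frame below is made up for illustration and is not your real data):

```r
# Hypothetical stand-in for a small slice of the asker's classsched data
classsched <- data.frame(
  DateTime = c("2016-01-01 11:10", "2016-10-03 12:16"),
  Course = c("CS", "Eng"),
  stringsAsFactors = FALSE
)

# dput() prints an R expression that recreates the object exactly,
# so it can be pasted straight into a question
dput(classsched)
```

Pasting the printed structure(...) expression into a question lets others rebuild the same object, including its column types, which matters here because the column type is the source of the warnings.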

Then you can do the following:

library(tidyverse)
library(lubridate)

df <- tibble(DateTime = dmy_hm(c("01/01/2016 11:10", "03/10/2016 12:16", "05/05/2017 13:17",
                                 "15/04/2018 14:11", "20/05/2018 15:21")),
             Course = c("CS", "Eng", "Poetry", "Reading", "Math"),
             Prof_in_time = c("Morgan", "Andrew", "Jen", "Eugene", "Matt"))
df # note the typos in the date format in your data

The code below filters your dates. Note that the date format used here may not match the one in your data.

You can do it like this:

start <- dmy_hm("01/10/2016 00:00") # note you had different formats
end <- dmy_hm("31/10/2017 23:59")   # range taken from date1 and date2 in the question

df2 <- df %>% 
  filter(DateTime >= start & DateTime <= end)
df2
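The warnings in the question mention "Ops.factor", which suggests that the DateTime column is stored as a factor rather than as a date, so the comparison with a Date cannot work. As a rough base R sketch of the same idea without extra packages (assuming the "%Y-%m-%d %H:%M" format from the question and a UTC time zone; the function name date_subset and the column name Professor are placeholders of mine):

```r
# Simulate the situation behind the warnings: DateTime stored as a factor
classsched <- data.frame(
  DateTime = factor(c("2016-01-01 11:10", "2016-10-03 12:16", "2017-05-05 13:17",
                      "2018-04-15 14:11", "2018-05-20 15:21")),
  Course = c("CS", "Eng", "Poetry", "Reading", "Math"),
  Professor = c("Morgan", "Andrew", "Jen", "Eugene", "Matt"),
  stringsAsFactors = FALSE
)

# Convert factor -> character -> POSIXct so the time of day is kept
classsched$DateTime <- as.POSIXct(as.character(classsched$DateTime),
                                  format = "%Y-%m-%d %H:%M", tz = "UTC")

# Keep the rows whose DateTime lies in [from, to]; the comma selects rows
date_subset <- function(df, from, to) {
  df[df$DateTime >= from & df$DateTime <= to, , drop = FALSE]
}

from <- as.POSIXct("2016-10-01 00:00", format = "%Y-%m-%d %H:%M", tz = "UTC")
to   <- as.POSIXct("2017-10-31 23:59", format = "%Y-%m-%d %H:%M", tz = "UTC")

result <- date_subset(classsched, from, to)
result
```

With the sample data this keeps the Eng and Poetry rows. Using POSIXct for both the column and the bounds avoids mixing classes; as.Date would drop the time of day, which matters if the range boundaries fall within a day.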