Nested if statement with dates

Time: 2016-09-23 04:39:44

Tags: r if-statement dataframe

I have a data frame df, shown below.

Id     ProcessDate
10     2011-12-29 14:14:00
11     2011-12-29 14:16:00
12     2011-12-29 14:14:00
13     2011-12-29 14:20:00
14     2011-12-29 14:49:00
15     2011-12-29 14:51:00
16     2011-12-29 14:53:00
17     2011-12-29 15:11:00
18     2011-12-29 15:13:00 
19     2011-12-29 15:10:00
20     2011-12-29 15:21:00
21     2011-12-29 14:34:00
22     2011-12-29 15:26:00  

I am trying to create a third column, Status, that contains one of three values {Before, Between, After}, based on this condition:

 if  (df$ProcessDate < 2011-12-29 14:48:00)
 then  df$Status = "Before"
 else if (df$ProcessDate > 2011-12-29 14:48:00 & df$ProcessDate < 2011-12-29 15:16:00)
 then  df$Status = "Between"
 else  df$Status = "After"

The final data frame should look like this.

Id     ProcessDate              Status
10     2011-12-29 14:14:00      Before
11     2011-12-29 14:16:00      Before
12     2011-12-29 14:14:00      Before
13     2011-12-29 14:20:00      Before
14     2011-12-29 14:49:00      Between
15     2011-12-29 14:51:00      Between       
16     2011-12-29 14:53:00      Between
17     2011-12-29 15:11:00      Between
18     2011-12-29 15:13:00      Between
19     2011-12-29 15:10:00      Between
20     2011-12-29 15:21:00      After
21     2011-12-29 14:34:00      After
22     2011-12-29 15:26:00      After

I tried a few things and none of them worked. Any help with this would be much appreciated.

4 answers:

Answer 0: (score: 6)

This could be one possible solution, resulting in:

   Id         ProcessDate  Status
1  10 2011-12-29 14:14:00  Before
2  11 2011-12-29 14:16:00  Before
3  12 2011-12-29 14:14:00  Before
4  13 2011-12-29 14:20:00  Before
5  14 2011-12-29 14:49:00 Between
6  15 2011-12-29 14:51:00 Between
7  16 2011-12-29 14:53:00 Between
8  17 2011-12-29 15:11:00 Between
9  18 2011-12-29 15:13:00 Between
10 19 2011-12-29 15:10:00 Between
11 20 2011-12-29 15:21:00   After
12 21 2011-12-29 14:34:00  Before
13 22 2011-12-29 15:26:00   After
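The code behind this output is not shown in this answer. As a minimal sketch only (not necessarily the answerer's original approach), the same table could be produced in base R by summing two logical comparisons and using the result as an index into a label vector. The names `lower`, `upper`, and `labels` are illustrative, and the fixed `tz = "UTC"` is an assumption made so the example behaves the same on any machine.

```r
# Rebuild the question's data; tz = "UTC" is an assumption for reproducibility
df <- data.frame(
  Id = 10:22,
  ProcessDate = as.POSIXct(c(
    "2011-12-29 14:14:00", "2011-12-29 14:16:00", "2011-12-29 14:14:00",
    "2011-12-29 14:20:00", "2011-12-29 14:49:00", "2011-12-29 14:51:00",
    "2011-12-29 14:53:00", "2011-12-29 15:11:00", "2011-12-29 15:13:00",
    "2011-12-29 15:10:00", "2011-12-29 15:21:00", "2011-12-29 14:34:00",
    "2011-12-29 15:26:00"), tz = "UTC")
)

# Hypothetical names for the two cutoffs from the question
lower <- as.POSIXct("2011-12-29 14:48:00", tz = "UTC")
upper <- as.POSIXct("2011-12-29 15:16:00", tz = "UTC")

# Each comparison yields TRUE/FALSE per row; their sum is 0, 1 or 2,
# so adding 1 gives an index into the label vector
labels <- c("Before", "Between", "After")
df$Status <- labels[1 + (df$ProcessDate >= lower) + (df$ProcessDate >= upper)]

df
```

Because the comparisons are vectorized, no loop or `if` statement is needed.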

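Answer 1: (score: 4)

Subset assignment

For this particular case, a very simple way to do this in base R is to set everything to 'Between' and then use subset assignment to change the rows that should be something else: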

df$ProcessDate <- as.POSIXct(df$ProcessDate)    # skip if already parsed to datetime

df$Status <- 'Between'
df$Status[df$ProcessDate < as.POSIXct('2011-12-29 14:48:00')] <- 'Before'
df$Status[df$ProcessDate >= as.POSIXct('2011-12-29 15:16:00')] <- 'After'

df
##    Id         ProcessDate  Status
## 1  10 2011-12-29 14:14:00  Before
## 2  11 2011-12-29 14:16:00  Before
## 3  12 2011-12-29 14:14:00  Before
## 4  13 2011-12-29 14:20:00  Before
## 5  14 2011-12-29 14:49:00 Between
## 6  15 2011-12-29 14:51:00 Between
## 7  16 2011-12-29 14:53:00 Between
## 8  17 2011-12-29 15:11:00 Between
## 9  18 2011-12-29 15:13:00 Between
## 10 19 2011-12-29 15:10:00 Between
## 11 20 2011-12-29 15:21:00   After
## 12 21 2011-12-29 14:34:00  Before
## 13 22 2011-12-29 15:26:00   After

cut

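A purpose-built way to do this is cut, which has a cut.POSIXt method. It needs breakpoints before and after the data in addition to the ones you want, but it gives you a factor, which suits categorical data.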

df$Status <- cut(df$ProcessDate, 
                 breaks = c(min(df$ProcessDate), 
                          as.POSIXct(c('2011-12-29 14:48:00', '2011-12-29 15:16:00')), 
                          max(df$ProcessDate) + 1), 
                 labels = c('Before', 'Between', 'After'))
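To see what `cut` returns here, the following sketch uses three made-up timestamps. The vectors `x` and `cutoffs` and the fixed `tz = "UTC"` are assumptions for illustration. It shows that the result is a factor and how to convert it to character if a factor is not wanted:

```r
# Three illustrative timestamps, one per category
x <- as.POSIXct(c("2011-12-29 14:14:00", "2011-12-29 14:49:00",
                  "2011-12-29 15:21:00"), tz = "UTC")
cutoffs <- as.POSIXct(c("2011-12-29 14:48:00", "2011-12-29 15:16:00"),
                      tz = "UTC")

# Outer breaks bracket the data; max(x) + 1 adds one second so the
# latest timestamp falls inside the last interval
status <- cut(x,
              breaks = c(min(x), cutoffs, max(x) + 1),
              labels = c("Before", "Between", "After"))

class(status)          # "factor"
levels(status)         # "Before" "Between" "After"
as.character(status)   # plain character vector
```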

Nested ifelse calls

The most common and most versatile base version is nested ifelse calls, which look ugly (particularly if there are many of them) but are quick to evaluate, because ifelse is vectorized while if is not:

df$Status <- ifelse(df$ProcessDate < as.POSIXct('2011-12-29 14:48:00'), 'Before', 
                    ifelse(df$ProcessDate < as.POSIXct('2011-12-29 15:16:00'), 'Between', 'After'))

dplyr

dplyr::case_when is a nice alternative to nested ifelse calls. It evaluates each condition in order and returns the corresponding value:

library(dplyr)

df %>% mutate(
    ProcessDate = as.POSIXct(ProcessDate),    # skip this line if already datetime
    # if this is true, then return "Before"
    Status = case_when(ProcessDate < as.POSIXct('2011-12-29 14:48:00') ~ 'Before', 
                       # for the rest, if this is true, return "Between"
                       ProcessDate < as.POSIXct('2011-12-29 15:16:00') ~ 'Between', 
                       # always true, so make the rest "After"
                       TRUE ~ 'After'))

All versions return the same thing except cut, which returns a factor instead of a character vector.
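On the point made earlier in this answer that ifelse is vectorized while if is not, here is a small sketch with a made-up numeric vector `x` (the names `x`, `vec`, and `res` are illustrative only). The `tryCatch` wrapper is there because the behavior of `if` with a longer condition differs between R versions.

```r
x <- c(1, 5, 10)

# ifelse() tests every element and returns a vector of the same length
vec <- ifelse(x < 3, "low", ifelse(x < 8, "mid", "high"))
vec   # "low" "mid" "high"

# if () expects a single TRUE/FALSE; a length-3 condition is an error in
# R >= 4.2.0 and a warning (only the first element is used) in older versions
res <- tryCatch(if (x < 3) "low" else "high",
                error = function(e) "error",
                warning = function(w) "warning")
res
```

This is why the pseudocode in the question cannot work on a whole column at once.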

Answer 2: (score: 4)

Try this:

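library(data.table)
DT <- as.data.table(df)   # assumes df$ProcessDate is already POSIXct

left <- as.POSIXct("12/29/2011 14:48", format = "%m/%d/%Y %H:%M") 
right <- as.POSIXct("12/29/2011 15:16", format = "%m/%d/%Y %H:%M") 
# := adds the Status column to DT by reference
DT[, Status := ifelse(ProcessDate < left, "before", 
            ifelse(ProcessDate > right, "after", "between"))]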

It gives:

    Id         ProcessDate  Status
 1: 10 2011-12-29 14:14:00  before
 2: 11 2011-12-29 14:16:00  before
 3: 12 2011-12-29 14:14:00  before
 4: 13 2011-12-29 14:20:00  before
 5: 14 2011-12-29 14:49:00 between
 6: 15 2011-12-29 14:51:00 between
 7: 16 2011-12-29 14:53:00 between
 8: 17 2011-12-29 15:11:00 between
 9: 18 2011-12-29 15:13:00 between
10: 19 2011-12-29 15:10:00 between
11: 20 2011-12-29 15:21:00   after
12: 21 2011-12-29 15:34:00   after
13: 22 2011-12-29 15:26:00   after

Same result as above, using the vectorized ifelse() with data.table.

Answer 3: (score: 0)

One possible solution is to convert your times to epoch values and then compare them. This can be done with as.integer(as.POSIXct("Time")), as shown below

df = data.frame(
    ids  = c(10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22),
    date = c('2011-12-29 14:14:00', '2011-12-29 14:16:00', '2011-12-29 14:14:00',
             '2011-12-29 14:20:00', '2011-12-29 14:49:00', '2011-12-29 14:51:00',
             '2011-12-29 14:53:00', '2011-12-29 15:11:00', '2011-12-29 15:13:00',
             '2011-12-29 15:10:00', '2011-12-29 15:21:00', '2011-12-29 14:34:00',
             '2011-12-29 15:26:00'),
    stringsAsFactors = FALSE)
df$date = as.integer(as.POSIXct(df$date))   # seconds since 1970-01-01 UTC

upper   = as.integer(as.POSIXct('2011-12-29 15:16:00'))
lower   = as.integer(as.POSIXct('2011-12-29 14:48:00'))

Your date column will be converted as follows (the exact integers depend on the time zone of your R session)

> df
    ids       date
1   10 1325148240
2   11 1325148360
3   12 1325148240
4   13 1325148600
5   14 1325150340
6   15 1325150460
7   16 1325150580
8   17 1325151660
9   18 1325151780
10  19 1325151600
11  20 1325152260
12  21 1325149440
13  22 1325152560
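The integers above depend on the time zone R uses to interpret the strings; they appear consistent with a UTC+05:30 zone. A minimal sketch, assuming you want values that are the same on every machine, passes an explicit `tz` argument. The zone names and the variables `utc` and `ist` are chosen only for illustration.

```r
# Seconds since 1970-01-01 00:00:00 UTC for a timestamp read as UTC
utc <- as.integer(as.POSIXct("2011-12-29 14:14:00", tz = "UTC"))
utc   # 1325168040

# The same wall-clock time read in a UTC+05:30 zone is 19800 seconds earlier
ist <- as.integer(as.POSIXct("2011-12-29 14:14:00", tz = "Asia/Kolkata"))
ist   # 1325148240

utc - ist   # 19800, i.e. 5.5 hours
```

As long as the data and the cutoffs are converted with the same time zone, the comparisons give the same result.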

Then you can simply do a numeric comparison

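df$Status = NA_character_    # create the column before filling it row by row
for (i in seq_len(nrow(df))) {
    if (df$date[i] < lower) {
        df$Status[i] = "Before"
    } else if (df$date[i] < upper) {
        # not below lower, so lower <= date < upper
        df$Status[i] = "Between"
    } else {
        df$Status[i] = "After"
    }
}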

resulting in the output

> df
    ids       date  Status
1   10 1325148240  Before
2   11 1325148360  Before
3   12 1325148240  Before
4   13 1325148600  Before
5   14 1325150340 Between
6   15 1325150460 Between
7   16 1325150580 Between
8   17 1325151660 Between
9   18 1325151780 Between
10  19 1325151600 Between
11  20 1325152260   After
12  21 1325149440  Before
13  22 1325152560   After
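As a vectorized alternative to the loop (a sketch, not part of the original answer), base R's `findInterval` can map each epoch value to an interval index in a single call. The short `date` vector below reuses three of the values above, and `lower` and `upper` are written as literal integers that assume the same time zone as the output shown.

```r
# Cutoffs as epoch seconds, assuming the time zone of the output above
lower <- 1325150280L   # 2011-12-29 14:48:00
upper <- 1325151960L   # 2011-12-29 15:16:00

# Three of the epoch values from the table (14:14, 14:49, 15:21)
date <- c(1325148240L, 1325150340L, 1325152260L)

# findInterval() returns 0 below lower, 1 in [lower, upper), 2 at or above upper
idx <- findInterval(date, c(lower, upper))
status <- c("Before", "Between", "After")[idx + 1]
status   # "Before" "Between" "After"
```

The intervals are closed on the left, so a value exactly equal to `lower` is labelled "Between" and a value equal to `upper` is labelled "After".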