Find the first occurrence of a value in each group in R

Time: 2016-11-01 22:13:42

Tags: r data.table dplyr

I have a dataset with records for each machine in a lab:

 MachineID InstalledDate SwitchedOnDate Status
 1           2010-02-18    2010-02-19    SleepMode
 1           2010-02-18    2010-02-20    Active
 1           2010-02-18    2010-02-21    SleepMode
 1           2010-02-18    2010-02-22    Active
 2           2010-02-20    2010-02-21    Active
 2           2010-02-20    2010-02-22    SleepMode
 3           2010-02-10    2010-02-18    SleepMode
 4           2010-03-10    2010-03-15    SleepMode

I am trying to find how many days it took each machine to first become active, counted from its installation date. That is, "SwitchedOnDate - InstalledDate".

2 answers:

Answer 0: (score: 3)

In data.table, it's basically the same idea:

library(data.table)
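setDT(df) # convert df to a data.table by reference

# Days from each machine's earliest SwitchedOnDate to its first "Active" row.
# Caveat: which.max() returns 1 when a group has no "Active" row,
# so such machines report a difference of 0 days.
df[, SwitchedOnDate[which.max(Status == "Active")] - min(SwitchedOnDate),
   by = MachineID]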

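If you want the output column to have a name (e.g. OffDuration), the syntax changes slightly:

# Do not pre-filter on Status == "Active" in i:
# min(SwitchedOnDate) must see every row of the group.
df[,
   .(OffDuration = 
       SwitchedOnDate[which.max(Status == "Active")] - min(SwitchedOnDate)),
   by = MachineID]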
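
A minimal sketch, not part of the original answer, of one way to report NA instead of 0 for machines that never become active. It assumes the sample data from the question and swaps which.max() for match(), which returns NA when "Active" does not occur in a group. The name res is only illustrative.

```r
library(data.table)

# Sample data copied from the question
df <- data.table(
  MachineID      = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 4L),
  InstalledDate  = as.Date(c("2010-02-18", "2010-02-18", "2010-02-18", "2010-02-18",
                             "2010-02-20", "2010-02-20", "2010-02-10", "2010-03-10")),
  SwitchedOnDate = as.Date(c("2010-02-19", "2010-02-20", "2010-02-21", "2010-02-22",
                             "2010-02-21", "2010-02-22", "2010-02-18", "2010-03-15")),
  Status         = c("SleepMode", "Active", "SleepMode", "Active",
                     "Active", "SleepMode", "SleepMode", "SleepMode")
)

# match() gives the position of the first "Active" row, or NA if there is none.
# Indexing a Date vector with NA yields NA, so the duration becomes NA as well.
# as.numeric() keeps the column type (double) identical across groups.
res <- df[,
          .(OffDuration = as.numeric(
              SwitchedOnDate[match("Active", Status)] - min(SwitchedOnDate))),
          by = MachineID]
print(res)
```

With the sample data this gives a duration in days for machines 1 and 2, and NA for machines 3 and 4, which have no "Active" row.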

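Answer 1: (score: 2)

Following comments from @Gregor and @Frank, a better approach is to use distinct to keep only the (first) unique row for each MachineID, instead of grouping by MachineID: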

library(dplyr)
res <- df %>% filter(Status=="Active") %>%
              distinct(MachineID, .keep_all=TRUE) %>%
              mutate(Days.Go.Active=difftime(SwitchedOnDate,InstalledDate,units="days"))
print(res)
##Source: local data frame [2 x 5]
##Groups: MachineID [2]
##
##  MachineID InstalledDate SwitchedOnDate Status Days.Go.Active
##      <int>        <date>         <date>  <chr> <S3: difftime>
##1         1    2010-02-18     2010-02-20 Active         2 days
##2         2    2010-02-20     2010-02-21 Active         1 days

Using dplyr with group_by, you can mutate with difftime to compute the difference in "days" units:

library(dplyr)
res <- df %>% group_by(MachineID) %>% 
              filter(Status=="Active") %>%
              filter(row_number()==1) %>%
              mutate(Days.Go.Active=difftime(SwitchedOnDate,InstalledDate,units="days"))
print(res)
##Source: local data frame [2 x 5]
##Groups: MachineID [2]
##
##  MachineID InstalledDate SwitchedOnDate Status Days.Go.Active
##      <int>        <date>         <date>  <chr> <S3: difftime>
##1         1    2010-02-18     2010-02-20 Active         2 days
##2         2    2010-02-20     2010-02-21 Active         1 days

Here, we group_by MachineID and then use filter to keep only the first row of each group whose Status is "Active".

Update to address the revised requirement of measuring from the first SwitchedOnDate:

res <- df %>% group_by(MachineID) %>%
              mutate(FirstSwitchedOnDate=first(SwitchedOnDate)) %>%
              filter(Status=="Active") %>%
              filter(row_number()==1) %>%
              mutate(Days.Go.Active=as.numeric(difftime(SwitchedOnDate,FirstSwitchedOnDate,units="days"))) %>%
              select(-FirstSwitchedOnDate)
##Source: local data frame [2 x 5]
##Groups: MachineID [2]
##
##  MachineID InstalledDate SwitchedOnDate Status Days.Go.Active
##      <int>        <date>         <date>  <chr>          <dbl>
##1         1    2010-02-18     2010-02-20 Active              1
##2         2    2010-02-20     2010-02-21 Active              0

Data:

df <- structure(list(
  MachineID = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 4L),
  InstalledDate = structure(c(14658, 14658, 14658, 14658, 14660, 14660, 14650, 14678), class = "Date"),
  SwitchedOnDate = structure(c(14659, 14660, 14661, 14662, 14661, 14662, 14658, 14683), class = "Date"),
  Status = c("SleepMode", "Active", "SleepMode", "Active", "Active", "SleepMode", "SleepMode", "SleepMode")),
  .Names = c("MachineID", "InstalledDate", "SwitchedOnDate", "Status"),
  row.names = c(NA, -8L), class = "data.frame")
##  MachineID InstalledDate SwitchedOnDate    Status
##1         1    2010-02-18     2010-02-19 SleepMode
##2         1    2010-02-18     2010-02-20    Active
##3         1    2010-02-18     2010-02-21 SleepMode
##4         1    2010-02-18     2010-02-22    Active
##5         2    2010-02-20     2010-02-21    Active
##6         2    2010-02-20     2010-02-22 SleepMode
##7         3    2010-02-10     2010-02-18 SleepMode
##8         4    2010-03-10     2010-03-15 SleepMode
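
As a quick check, the sketch below (an illustration, not part of the original answer) runs the distinct variant and the grouped row_number() variant on the question's sample data and compares the rows they keep. The names active, via_distinct, via_group and same_rows are hypothetical.

```r
library(dplyr)

# Sample data copied from the question
df <- data.frame(
  MachineID      = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 4L),
  InstalledDate  = as.Date(c("2010-02-18", "2010-02-18", "2010-02-18", "2010-02-18",
                             "2010-02-20", "2010-02-20", "2010-02-10", "2010-03-10")),
  SwitchedOnDate = as.Date(c("2010-02-19", "2010-02-20", "2010-02-21", "2010-02-22",
                             "2010-02-21", "2010-02-22", "2010-02-18", "2010-03-15")),
  Status         = c("SleepMode", "Active", "SleepMode", "Active",
                     "Active", "SleepMode", "SleepMode", "SleepMode"),
  stringsAsFactors = FALSE
)

# Keep only the rows where the machine is active
active <- df %>% filter(Status == "Active")

# Variant 1: distinct() keeps the first row seen for each MachineID
via_distinct <- active %>%
  distinct(MachineID, .keep_all = TRUE)

# Variant 2: group, then keep the first row of each group
via_group <- active %>%
  group_by(MachineID) %>%
  filter(row_number() == 1) %>%
  ungroup()

# TRUE when both variants kept the same machines and the same dates
same_rows <- identical(via_distinct$MachineID, via_group$MachineID) &&
  isTRUE(all.equal(via_distinct$SwitchedOnDate, via_group$SwitchedOnDate))
print(same_rows)
```

Both variants rely on the rows already being ordered by SwitchedOnDate within each machine, as they are in the sample data. If that is not guaranteed, sorting with arrange(MachineID, SwitchedOnDate) first would be a reasonable precaution.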