Ordering dates in R with a new variable

Time: 2018-03-06 14:53:17

Tags: r duplicates

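I need to order duplicates (first visit, second visit, third visit, ... as categories in another variable) so that I can then sort them by that variable. What I need is to build something that takes data like this: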

ids <- c("a", "a", "b", "c", "c", "c")
Dates <- c("2012-06-09", "2012-05-17", "2012-07-13", "2012-08-25", "2013-04-11", "2014-11-03")
mydata <- data.frame(ids, Dates)
ids      Dates
a 2012-06-09
a 2012-05-17
b 2012-07-13
c 2012-08-25
c 2013-04-11
c 2014-11-03

Desired result:

ids      Dates  order
a 2012-06-09  First
a 2012-05-17 Second
b 2012-07-13  First
c 2012-08-25  First
c 2013-04-11 Second
c 2014-11-03  Third
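
The desired long-format table above can be sketched in base R as follows. This is only a minimal illustration: the `labels` vector and the `visit` helper are names introduced here, and it assumes that no id occurs more than three times and that "First" means the first row as it appears in the data (not the earliest date), as in the example.

```r
ids <- c("a", "a", "b", "c", "c", "c")
Dates <- c("2012-06-09", "2012-05-17", "2012-07-13",
           "2012-08-25", "2013-04-11", "2014-11-03")
mydata <- data.frame(ids, Dates, stringsAsFactors = FALSE)

# Hypothetical label vector; extend it if an id can occur more than three times.
labels <- c("First", "Second", "Third")

# Position of each row within its id group, in order of appearance.
visit <- ave(seq_along(mydata$ids), mydata$ids, FUN = seq_along)

# Translate the running count into the ordinal label.
mydata$order <- labels[visit]
print(mydata)
```

Running the counter over `seq_along(mydata$ids)` rather than over the `Dates` column keeps the result an integer whether or not the columns were read in as factors.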

4 Answers:

Answer 0: (score: 1)

Using the data.table package together with the reshape function from base R:

library(data.table)

setDT(mydata)
mydata[, order := 1:.N, by = ids]
mydata <- reshape(mydata, timevar = "order", idvar = "ids", direction = "wide")

setnames(mydata, "ids", "ids2")
setnames(mydata, "Dates.1", "first")
setnames(mydata, "Dates.2", "second")
setnames(mydata, "Dates.3", "third")

   ids2      first     second      third
1:    a 2012-06-09 2012-05-17         NA
2:    b 2012-07-13         NA         NA
3:    c 2012-08-25 2013-04-11 2014-11-03

Answer 1: (score: 1)

With base R only, you can do:

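# Count rows within each id; counting over row indices (not over Dates) also works when Dates is a factor
mydata$order <- ave(seq_along(mydata$ids), mydata$ids, FUN = seq_along)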
reshape(mydata, idvar = "ids", timevar = "order", direction="wide")
#   ids    Dates.1    Dates.2    Dates.3
# 1   a 2012-06-09 2012-05-17       <NA>
# 3   b 2012-07-13       <NA>       <NA>
# 4   c 2012-08-25 2013-04-11 2014-11-03
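
If the wide columns should carry the ordinal names rather than `Dates.1`, `Dates.2`, `Dates.3`, one possible variation is to use the label itself as the `timevar` and then strip the prefix that `reshape` adds. This is a sketch under the assumption that at most three rows exist per id; `labels` and `wide` are names chosen here for illustration.

```r
ids <- c("a", "a", "b", "c", "c", "c")
Dates <- c("2012-06-09", "2012-05-17", "2012-07-13",
           "2012-08-25", "2013-04-11", "2014-11-03")
mydata <- data.frame(ids, Dates, stringsAsFactors = FALSE)

# Hypothetical label vector; assumes no id has more than three rows.
labels <- c("First", "Second", "Third")

# Label each row by its position within its id group.
mydata$order <- labels[ave(seq_along(mydata$ids), mydata$ids, FUN = seq_along)]

# reshape() names the wide columns "Dates.<label>".
wide <- reshape(mydata, idvar = "ids", timevar = "order", direction = "wide")

# Drop the "Dates." prefix so the columns are just First, Second, Third.
names(wide) <- sub("^Dates\\.", "", names(wide))
print(wide)
```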

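Answer 2: (score: 0)

Using the tidyverse, you can do the following (note that this sorts by date first, so "First" is the earliest date for each id rather than the first row):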

library(tidyverse)

mydata.ordered <- mydata %>% 
  mutate(Dates = as.Date(Dates)) %>% 
  group_by(ids) %>% 
  arrange(Dates) %>% 
  mutate(order = rep(c("First", "Second", "Third"), length.out = n()))


mydata.spread <- mydata.ordered %>% 
  spread(order, Dates)

# A tibble: 3 x 4
# Groups:   ids [3]
#   ids   First      Second     Third     
# * <fct> <date>     <date>     <date>    
# 1 a     2012-05-17 2012-06-09 NA        
# 2 b     2012-07-13 NA         NA        
# 3 c     2012-08-25 2013-04-11 2014-11-03
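
For comparison, the same date-based labelling can be sketched in base R without reordering the rows. This is an illustrative alternative, not the answer's method; `labels` and `date_rank` are names introduced here, and it assumes at most three rows per id.

```r
ids <- c("a", "a", "b", "c", "c", "c")
Dates <- c("2012-06-09", "2012-05-17", "2012-07-13",
           "2012-08-25", "2013-04-11", "2014-11-03")
mydata <- data.frame(ids, Dates, stringsAsFactors = FALSE)

# Hypothetical label vector; assumes no id has more than three rows.
labels <- c("First", "Second", "Third")

# Rank the dates within each id; ties are broken by order of appearance.
date_rank <- ave(as.numeric(as.Date(mydata$Dates)), mydata$ids,
                 FUN = function(d) rank(d, ties.method = "first"))

# Rows stay in their original order; only the label reflects the date order.
mydata$order <- labels[date_rank]
print(mydata)
```

With this data, the first row for id `a` (2012-06-09) is labelled "Second" because the other `a` row has an earlier date, which differs from the output requested in the question.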

Answer 3: (score: 0)

Using the dplyr and tidyr libraries:

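library(dplyr)
library(tidyr)

# Lookup table mapping each position (1, 2, 3) to its ordinal label
ordinal <- c("First", "Second", "Third")
ordinal.df <- data.frame(ordinal = seq_along(ordinal), order = ordinal)

group_by(mydata, ids) %>%
  mutate(ordinal = 1:n()) %>%                  # position within each id
  left_join(ordinal.df, by = "ordinal") %>%    # attach the label
  select(-ordinal) %>%
  spread(key = "order", value = "Dates")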

# A tibble: 3 x 4
# Groups:   ids [3]
  ids   First      Second     Third     
* <fct> <fct>      <fct>      <fct>     
1 a     2012-06-09 2012-05-17 NA        
2 b     2012-07-13 NA         NA        
3 c     2012-08-25 2013-04-11 2014-11-03