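I am trying to index my data frame. My data has a "Date" column (ranging from "01/05/2015" to "31/05/2015"), and I want to create another column (1 to 31) that numbers the distinct dates. For example: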
Date Indicator
01/05/2015 1
01/05/2015 1
02/05/2015 2
11/05/2015 3
11/05/2015 3
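How can I do this easily?
I also have a second question: if there is an "ID" column before "Date", I want to create the indicator separately for each ID, like this: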
ID Date Indicator
ID1 1992-02-27 1
ID1 1992-02-27 1
ID1 1992-01-14 2
ID1 1992-02-28 3
ID2 1992-02-01 1
ID2 1992-02-01 1
ID2 1992-02-01 1
ID2 1992-07-01 2
How can I do this? Do I have to use a for loop?
Thanks!
Answer 0 (score: 1)
You can do it like this:
dates <- c("02/27/92", "02/27/92", "01/14/92", "02/28/92", "02/01/92", "02/01/92", "02/01/92")
df <- data.frame(Date = as.Date(dates, "%m/%d/%y"))
# Start at 1 and increment whenever a date differs from the one in the previous row
df$Indicator <- c(1, 1 + cumsum(diff(df$Date) != 0))
Result:
> df
Date Indicator
1 1992-02-27 1
2 1992-02-27 1
3 1992-01-14 2
4 1992-02-28 3
5 1992-02-01 4
6 1992-02-01 4
7 1992-02-01 4
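Note that the `diff`-based approach numbers runs of consecutive equal dates, so a date that reappears further down would get a new number. If the goal is to number distinct dates regardless of where they appear, a minimal sketch with base R's `match` and `unique` might look like the following. This is not part of the original answer, and the example data (including the `Rank` column name) is made up for illustration.

```r
# Hypothetical data: 01/05/2015 reappears after other dates
dates <- c("01/05/2015", "01/05/2015", "02/05/2015", "11/05/2015", "01/05/2015")
df <- data.frame(Date = as.Date(dates, "%d/%m/%Y"))

# Number each date by the position of its first appearance
df$Indicator <- match(df$Date, unique(df$Date))

# Alternative: number the dates in chronological order instead
df$Rank <- as.integer(factor(df$Date))
```

The two columns agree whenever the dates first appear in chronological order, as they do in this example.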
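Edit (to number the dates within each ID):
dates <- c("02/27/92", "02/27/92", "01/14/92", "02/28/92", "02/01/92", "02/01/92", "02/01/92")
ID <- c(rep("ID1", 3), rep("ID2", 4))
df <- data.frame(ID = ID, Date = as.Date(dates, "%m/%d/%y"))
# Same run-based numbering as above, wrapped in a function
my_index <- function(date) { c(1, 1 + cumsum(diff(date) != 0)) }
# Apply it within each ID; this assumes the rows are already grouped by ID in sorted order
df$Indices <- do.call(c, tapply(df$Date, df$ID, my_index))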
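As a possible alternative that does not depend on the row order, base R's `ave` applies a function within each group and writes the results back in the original row positions. The sketch below is an assumption-laden variation rather than the original answer's method: it splits the IDs 4/3 to mirror the question's table (dropping its last row), and it numbers distinct dates by first appearance with `match` instead of counting runs.

```r
dates <- c("02/27/92", "02/27/92", "01/14/92", "02/28/92", "02/01/92", "02/01/92", "02/01/92")
ID <- c(rep("ID1", 4), rep("ID2", 3))  # assumed split, following the question's table
df <- data.frame(ID = ID, Date = as.Date(dates, "%m/%d/%y"))

# ave() runs FUN on the dates of each ID and keeps the original row order.
# as.numeric() avoids the result being coerced back to the Date class.
df$Indicator <- ave(as.numeric(df$Date), df$ID,
                    FUN = function(d) match(d, unique(d)))
```

No explicit for loop is needed in either version; the grouping function handles the per-ID iteration.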