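I have data on tourism interactions with individually identified whales. For each encounter I have the whale's ID, the encounter date, and the encounter time: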
Id Date Time
A 20110527 10:42
A 20110527 11:24
A 20110527 11:52
A 20110603 10:29
A 20110603 10:59
B 20110503 11:23
B 20110503 11:45
B 20110503 12:05
B 20110503 12:17
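I now want to add two more columns: for each individual, the number of the day on which the encounter took place, and the number of the encounter within that day, as shown below: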
Id Date Time Day Encounter
A 20110527 10:42 1 1
A 20110527 11:24 1 2
A 20110527 11:52 1 3
A 20110603 10:29 2 1
A 20110603 10:59 2 2
B 20110503 11:23 1 1
B 20110503 11:45 1 2
B 20110503 12:05 1 3
B 20110503 12:17 1 4
Is this possible? Any help would be greatly appreciated!
Answer 0 (score: 2)
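We can use data.table. Convert the 'data.frame' to a 'data.table' (setDT(df1)). Grouped by 'Id', we match 'Date' against the unique values of 'Date' to create the 'Day' column. Then, grouped by 'Id' and 'Date', we assign (:=) the sequence of rows to 'Encounter'.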
library(data.table)
setDT(df1)[, Day:= match(Date, unique(Date)), by = Id
][, Encounter := seq_len(.N), by = .(Id, Date)]
df1
# Id Date Time Day Encounter
#1: A 20110527 10:42 1 1
#2: A 20110527 11:24 1 2
#3: A 20110527 11:52 1 3
#4: A 20110603 10:29 2 1
#5: A 20110603 10:59 2 2
#6: B 20110503 11:23 1 1
#7: B 20110503 11:45 1 2
#8: B 20110503 12:05 1 3
#9: B 20110503 12:17 1 4
# Data used above
df1 <- structure(list(
    Id = c("A", "A", "A", "A", "A", "B", "B", "B", "B"),
    Date = c(20110527L, 20110527L, 20110527L, 20110603L, 20110603L,
             20110503L, 20110503L, 20110503L, 20110503L),
    Time = c("10:42", "11:24", "11:52", "10:29", "10:59",
             "11:23", "11:45", "12:05", "12:17")),
  .Names = c("Id", "Date", "Time"),
  class = "data.frame", row.names = c(NA, -9L))
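One detail of the match/unique idiom above is worth seeing in isolation: it numbers days by order of first appearance within the group, which equals chronological order only if the rows are already sorted by date. A minimal base R sketch, using a made-up, deliberately unsorted vector (the names dates, by_appearance and chronological are illustrative only):

```r
# Hypothetical dates for one individual, deliberately out of order
dates <- c(20110603L, 20110527L, 20110603L)

# Numbering by order of first appearance (what match(Date, unique(Date)) does)
by_appearance <- match(dates, unique(dates))

# Numbering by chronological order: sort the unique values first
chronological <- match(dates, sort(unique(dates)))

print(by_appearance)
print(chronological)
```

If the data might not be sorted, the second form can probably be dropped into the grouped call in place of the first.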
Answer 1 (score: 1)
Here is a reproducible example:
df <- structure(list(
Id = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
.Label = c("A", "B"), class = "factor"),
Date = c(20110527L, 20110527L, 20110527L, 20110603L,
20110603L, 20110503L, 20110503L,
20110503L, 20110503L),
Time = structure(c(2L, 5L, 7L, 1L, 3L, 4L, 6L, 8L, 9L),
.Label = c("10:29", "10:42", "10:59", "11:23", "11:24", "11:45", "11:52", "12:05", "12:17"), class = "factor")),
.Names = c("Id", "Date", "Time"), class = "data.frame", row.names = c(NA, -9L))
You can then use dplyr (this adds the Encounter column only):
library(dplyr)
group_by(df, Id, Date) %>% mutate(Encounter=1:n()) %>% ungroup()
Source: local data frame [9 x 4]
Id Date Time Encounter
(fctr) (int) (fctr) (int)
1 A 20110527 10:42 1
2 A 20110527 11:24 2
3 A 20110527 11:52 3
4 A 20110603 10:29 1
5 A 20110603 10:59 2
6 B 20110503 11:23 1
7 B 20110503 11:45 2
8 B 20110503 12:05 3
9 B 20110503 12:17 4
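The output above has no Day column. A possible dplyr sketch that adds both columns is given below; it assumes dplyr is installed and swaps in dense_rank() and row_number(), which are not part of the original answer. The data frame is rebuilt with data.frame() so the block runs on its own.

```r
library(dplyr)

# Same data as in the question, rebuilt so the example is self-contained
df <- data.frame(
  Id   = c("A", "A", "A", "A", "A", "B", "B", "B", "B"),
  Date = c(20110527L, 20110527L, 20110527L, 20110603L, 20110603L,
           20110503L, 20110503L, 20110503L, 20110503L),
  Time = c("10:42", "11:24", "11:52", "10:29", "10:59",
           "11:23", "11:45", "12:05", "12:17")
)

res <- df %>%
  group_by(Id) %>%
  mutate(Day = dense_rank(Date)) %>%        # chronological day number per whale
  group_by(Id, Date) %>%
  mutate(Encounter = row_number()) %>%      # running count within each whale-day
  ungroup()

print(res)
```

Because dense_rank() ranks the dates themselves, Day does not depend on row order; Encounter still follows the order of the rows within each day.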
Answer 2 (score: 1)
Or, in base R, using ave and by.
I used the data posted by Vincent Bonhomme (the data must be sorted by Id and Date):
# Function to count the days per individual using factor levels
foo <- function(x){as.numeric(as.character(factor(x,labels = 1:nlevels(factor(x)))))}
# Add the columns Day & Encounter
df$Day <-unlist(by(df$Date,list(df$Id),FUN=foo))
df$Encounter <- ave(1:nrow(df),list(df$Id,df$Date),FUN=seq_along)
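As a variation, both columns can probably be built with ave alone, which avoids the unlist(by(...)) step and its reliance on the rows being sorted by Id. This is a sketch under that assumption, not the answer's original method; the data frame is rebuilt here so the block is self-contained.

```r
# Same data as in the question
df <- data.frame(
  Id   = c("A", "A", "A", "A", "A", "B", "B", "B", "B"),
  Date = c(20110527L, 20110527L, 20110527L, 20110603L, 20110603L,
           20110503L, 20110503L, 20110503L, 20110503L),
  Time = c("10:42", "11:24", "11:52", "10:29", "10:59",
           "11:23", "11:45", "12:05", "12:17")
)

# Day: position of each date among the sorted unique dates of that Id
df$Day <- ave(df$Date, df$Id,
              FUN = function(d) match(d, sort(unique(d))))

# Encounter: running index within each Id/Date combination
df$Encounter <- ave(seq_len(nrow(df)), df$Id, df$Date, FUN = seq_along)

print(df)
```

Encounter is still numbered in row order within each day, so the rows should be in time order within a day for the count to be meaningful.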