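The df dataset contains Direction values and Timecode measurements. I want to generate:
Difference_begin_end: the difference in seconds between the last Timecode and the first Timecode of each Direction within each Sample_ID.
Difference_begin_end_all: the difference in seconds between the first row and the last row of each Sample_ID, across Direction values.
Here is the df dataset: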
df=structure(list(Sample_ID = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), Direction = c(0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L), Timecode = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 5L, 5L,
6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L), .Label = c("",
"17:02:10", "17:02:11", "17:02:12", "17:02:13", "17:02:15", "17:02:26",
"17:02:47", "17:02:48", "17:02:49", "17:02:50", "17:02:59", "17:03:02",
"17:03:03", "17:03:07", "17:03:10", "17:03:11"), class = "factor")), .Names = c("Sample_ID",
"Direction", "Timecode"), row.names = c(NA, 50L), class = "data.frame")
Answer 0 (score: 1)
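If the time variable is of class character or factor, it needs to be converted first with as.POSIXct(), strptime(), or a similar function from the lubridate package. Here is my solution: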
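library(dplyr)

df %>%
  group_by(Sample_ID) %>%
  # Parse the "HH:MM:SS" labels into date-times (the current date is filled in)
  mutate(Timecode = as.POSIXct(Timecode, format = "%H:%M:%S"),
         # Span between the first and last row of each Sample_ID
         yellow = last(Timecode) - first(Timecode)) %>%
  group_by(Sample_ID, Direction) %>%
  # Span between the first and last row of each Direction within a Sample_ID
  mutate(red_purple = last(Timecode) - first(Timecode))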
# # A tibble: 99 x 5
# # Groups:   Sample_ID, Direction [4]
#   Sample_ID Direction Timecode            yellow  red_purple
#       <int>     <int> <dttm>              <time>  <time>
# 1         1         0 2018-12-24 17:02:10 61 secs 5 secs
# 2         1         0 2018-12-24 17:02:10 61 secs 5 secs
# 3         1         0 2018-12-24 17:02:10 61 secs 5 secs
# 4         1         0 2018-12-24 17:02:10 61 secs 5 secs
# 5         1         0 2018-12-24 17:02:10 61 secs 5 secs
The variables yellow and red_purple correspond to the colors in the second picture of the question (EDIT 2).
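Since the question asks for the differences in seconds, it may be worth fixing the unit explicitly. Subtracting two POSIXct values lets R choose the unit automatically, so a span of 60 seconds or more is reported in minutes. Below is a minimal sketch of the same grouping idea using difftime() with units = "secs". The toy data frame and the helper span_secs() are made up for illustration, and mapping Difference_begin_end_all to the per-Sample_ID span and Difference_begin_end to the per-Direction span is my reading of the question, not something it confirms.

```r
library(dplyr)

# Made-up sample in the same shape as df (not the question's data).
toy <- data.frame(
  Sample_ID = c(1L, 1L, 1L, 1L, 2L, 2L),
  Direction = c(0L, 0L, 1L, 1L, 0L, 1L),
  Timecode  = factor(c("17:02:10", "17:02:15", "17:02:26", "17:03:07",
                       "09:00:00", "09:02:00"))
)

# Hypothetical helper: seconds between the first and last element of a
# POSIXct vector, returned as a plain number.
span_secs <- function(x) {
  as.numeric(difftime(last(x), first(x), units = "secs"))
}

res <- toy %>%
  # Convert the factor labels to date-times; a fixed time zone avoids
  # daylight-saving surprises.
  mutate(Timecode = as.POSIXct(as.character(Timecode),
                               format = "%H:%M:%S", tz = "UTC")) %>%
  group_by(Sample_ID) %>%
  mutate(Difference_begin_end_all = span_secs(Timecode)) %>%
  group_by(Sample_ID, Direction) %>%
  mutate(Difference_begin_end = span_secs(Timecode)) %>%
  ungroup()

print(res)
```

Both new columns are plain numeric seconds, so they can be compared or summed directly without worrying about mixed difftime units. A group with a single row gets a span of 0.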