This is my first time asking a question, so please bear with me.
My dataset (df) looks like this:
animal azimuth south distance
pb1 187.561 1 1.992
pb1 147.219 1 8.567
pb1 71.032 0 5.754
pb1 119.502 1 10.451
pb2 101.702 1 9.227
pb2 85.715 0 8.821
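I want to create an additional column (df$cumdist) that holds the cumulative distance, but only within each individual animal and only while df$south==1. When df$south==0, I want the cumulative sum to reset.
This is the result I am after (done by hand):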
animal azimuth south distance cumdist
pb1 187.561 1 1.992 1.992
pb1 147.219 1 8.567 10.559
pb1 71.032 0 5.754 0
pb1 119.502 1 10.451 10.451
pb2 101.702 1 9.227 9.227
pb2 85.715 0 8.821 0
This is the code I tried for the cumulative sum:
swim.az$cumdist <- cumsum(ifelse(swim.az$south==1, swim.az$distance, 0))
It successfully stops adding when df$south==0, but it does not reset. I also know I would need to embed this in a for loop to subset by animal.
Many thanks!
Answer 0 (score: 4)
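We multiply 'south' by 'distance' to create 'cumdist', so that any 'distance' value on a row where 'south' is 0 becomes 0. We then group by 'animal' and by the cumulative sum of the logical vector (south == 0), which starts a new group at every 0 row, take the cumsum of 'cumdist' within each group, ungroup, and remove the column that is no longer needed (grp).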
library(dplyr)
dfN %>%
mutate(cumdist = south * distance) %>%
group_by(animal, grp = cumsum(south == 0)) %>%
mutate(cumdist = cumsum(cumdist)) %>%
ungroup %>%
select(-grp)
# A tibble: 6 x 5
# animal azimuth south distance cumdist
# <chr> <dbl> <int> <dbl> <dbl>
#1 pb1 188. 1 1.99 1.99
#2 pb1 147. 1 8.57 10.6
#3 pb1 71.0 0 5.75 0
#4 pb1 120. 1 10.5 10.5
#5 pb2 102. 1 9.23 9.23
#6 pb2 85.7 0 8.82 0
Or using base R:
with(dfN, ave(distance * south, animal, cumsum(!south), FUN = cumsum))
#[1] 1.992 10.559 0.000 10.451 9.227 0.000
Data:
dfN <- structure(list(animal = c("pb1", "pb1", "pb1", "pb1", "pb2",
"pb2"), azimuth = c(187.561, 147.219, 71.032, 119.502, 101.702,
85.715), south = c(1L, 1L, 0L, 1L, 1L, 0L), distance = c(1.992,
8.567, 5.754, 10.451, 9.227, 8.821)), class = "data.frame",
row.names = c(NA, -6L))
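To see why grouping on cumsum(south == 0) produces the reset, it can help to print the intermediate vectors. The following is only an illustrative base R sketch on the same six rows; the names masked and grp are chosen here for the illustration and are not part of the answer's code.

```r
# Same six rows as dfN above, rebuilt so the snippet runs on its own
animal   <- c("pb1", "pb1", "pb1", "pb1", "pb2", "pb2")
south    <- c(1L, 1L, 0L, 1L, 1L, 0L)
distance <- c(1.992, 8.567, 5.754, 10.451, 9.227, 8.821)

# Step 1: zero out distances on rows where south == 0
masked <- distance * south

# Step 2: every south == 0 row bumps the counter, so it opens a new group
grp <- cumsum(south == 0)
print(grp)
# 0 0 1 1 1 2

# Step 3: cumulative sum within each (animal, grp) combination
cumdist <- ave(masked, animal, grp, FUN = cumsum)
print(cumdist)
# 1.992 10.559 0.000 10.451 9.227 0.000
```

Because each 0 row begins its own group and contributes 0 to it, the running total restarts at 0 there and picks up again from the next south == 1 row.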
Answer 1 (score: 4)
library(data.table)
setDT(df)
df[, cumdist := south*cumsum(distance), .(animal, rleid(south))]
# animal azimuth south distance cumdist
# 1: pb1 187.561 1 1.992 1.992
# 2: pb1 147.219 1 8.567 10.559
# 3: pb1 71.032 0 5.754 0.000
# 4: pb1 119.502 1 10.451 10.451
# 5: pb2 101.702 1 9.227 9.227
# 6: pb2 85.715 0 8.821 0.000
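The grouping above relies on rleid(south), which assigns one id to each run of consecutive equal values. As a rough sketch of the same idea without data.table, the run ids can be rebuilt from base R's rle(); the name run_id is made up for this illustration, and this is an approximation of the approach rather than the answer's own code.

```r
# Same six rows as in the question, rebuilt so the snippet runs on its own
animal   <- c("pb1", "pb1", "pb1", "pb1", "pb2", "pb2")
south    <- c(1L, 1L, 0L, 1L, 1L, 0L)
distance <- c(1.992, 8.567, 5.754, 10.451, 9.227, 8.821)

# One id per run of consecutive equal values in south
r <- rle(south)
run_id <- rep(seq_along(r$lengths), r$lengths)
print(run_id)
# 1 1 2 3 3 4

# Cumulative sum within each (animal, run), then zero out the south == 0 rows
cumdist <- south * ave(distance, animal, run_id, FUN = cumsum)
print(cumdist)
# 1.992 10.559 0.000 10.451 9.227 0.000
```

Unlike the cumsum(south == 0) grouping, a run id changes whenever south flips in either direction, so the multiplication by south is what forces the 0 rows to 0.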