I have a data frame A that looks like this:
CHR POS X Y
1 447892 0.994 0.994
1 651929 0.988 1.982
1 741566 0.982 2.964
1 741566+n
...
2 2000 0.347 0.347
2 3444 0.421 0.768
2 3444+m
...
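Here the observations are grouped by CHR, and POS holds an ordered sequence of values. Y is the cumulative sum of X. For each row within a CHR, I want to split POS into two columns, N1 and N2, to get a result like this: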
CHR N1 N2 X Y
1 447892 651929 0.994 0.994
1 651929 741566 0.988 1.982
1 741566 741566+n 0.982 2.964
2 2000 3444 0.347 0.347
2 3444 3444+m 0.421 0.768
Answer 0 (score: 1)
One option is to take the lead of 'POS' after grouping by 'CHR':
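library(dplyr)
df1 %>%
  group_by(CHR) %>%
  # N1 is the current POS; N2 is the next POS within the same CHR
  transmute(X, Y, N1 = POS, N2 = lead(POS)) %>%
  # the last row of each group has no next POS (N2 is NA), so it is dropped
  na.omit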
# A tibble: 4 x 5
# Groups: CHR [2]
# CHR X Y N1 N2
# <int> <dbl> <dbl> <int> <int>
#1 1 0.994 0.994 447892 651929
#2 1 0.988 1.98 651929 741566
#3 1 0.982 2.96 741566 741566
#4 2 0.347 0.347 2000 3444
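To see why the last row of each group disappears, it may help to look at lead() on a plain vector: it returns the next element and pads the end with NA. A minimal illustration (the names pos and shifted are only for this example):

```r
library(dplyr)

# Three positions taken from CHR 1 in the question
pos <- c(447892L, 651929L, 741566L)

# lead() shifts the vector up by one; the final slot has no successor
shifted <- lead(pos)
shifted
```

The final element of shifted is NA, which is the row that na.omit removes in each CHR group.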
Data:
df1 <- structure(list(CHR = c(1L, 1L, 1L, 1L, 2L, 2L), POS = c(447892L,
651929L, 741566L, 741566L, 2000L, 3444L), X = c(0.994, 0.988,
0.982, 0.55, 0.347, 0.421), Y = c(0.994, 1.982, 2.964, 2.54,
0.347, 0.768)), class = "data.frame", row.names = c(NA, -6L))
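If the column order from the question (CHR N1 N2 X Y) matters, a variant along the following lines might work. It is only a sketch, not the original answer's code: it rebuilds df1 with data.frame() using the same values, and it filters on N2 alone, on the assumption that rows with NA in X or Y should be kept (na.omit would drop those as well). The name res is just for illustration.

```r
library(dplyr)

# Same values as df1 in the answer, written with data.frame()
df1 <- data.frame(
  CHR = c(1L, 1L, 1L, 1L, 2L, 2L),
  POS = c(447892L, 651929L, 741566L, 741566L, 2000L, 3444L),
  X   = c(0.994, 0.988, 0.982, 0.55, 0.347, 0.421),
  Y   = c(0.994, 1.982, 2.964, 2.54, 0.347, 0.768)
)

res <- df1 %>%
  group_by(CHR) %>%
  # N1 is the current POS, N2 the next POS within the same CHR
  mutate(N1 = POS, N2 = lead(POS)) %>%
  ungroup() %>%
  # drop only the last row of each group, where no next POS exists
  filter(!is.na(N2)) %>%
  # reorder columns to match the layout requested in the question
  select(CHR, N1, N2, X, Y)

res
```

Calling ungroup() before the final steps returns an ordinary tibble, so later operations on res are not silently applied per group.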