Intersecting two groups within a group

Time: 2019-04-29 10:47:47

Tags: r lapply

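I have a data.table with two grouping columns, "step" and "revision". Within each step, I want to find the intersection of the column "xy" across that step's revisions.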

I can compute the intersection for a single step with Reduce and intersect, like this:

Reduce( intersect, dt[step==35, .( list( unique(xy) ) ), revision]$V1 )
Reduce( intersect, dt[step==125, .( list( unique(xy) ) ), revision]$V1 )

How can I use one of the apply functions to loop over the different steps in the data table?

dput output of the data table:

structure(list(step = c(35L, 35L, 35L, 35L, 35L, 35L, 35L, 35L, 
35L, 35L, 35L, 35L, 35L, 125L, 125L, 125L, 125L, 125L, 125L, 
125L, 125L, 125L, 125L, 125L, 125L, 125L), revision = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L), .Label = c("A", "B", "C", 
"D"), class = "factor"), x = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 
9L, 1L, 3L, 5L, 7L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 
12L, 14L, 16L, 18L), y = c(15L, 15L, 15L, 15L, 15L, 15L, 15L, 
15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 
15L, 15L, 15L, 15L, 15L, 15L), v1 = c(0.981164967, 0.357821411, 
0.384904093, 0.009365961, 0.009815099, 0.438105885, 0.863393809, 
0.573781691, 0.740820825, 0.641646552, 0.943973241, 0.22291045, 
0.570410137, 0.659222737, 0.292184924, 0.508645342, 0.453675215, 
0.269587944, 0.990680599, 0.313789353, 0.558325641, 0.26494047, 
0.589091867, 0.674127791, 0.25342589, 0.790942138), v2 = c(0.496975999, 
0.211650614, 0.000808426, 0.224731638, 0.009959817, 0.400215816, 
0.405663446, 0.729970951, 0.868291344, 0.089448303, 0.940810964, 
0.284396467, 0.620131798, 0.915335866, 0.30428197, 0.274177649, 
0.845965456, 0.045879344, 0.631042628, 0.716668545, 0.219162129, 
0.644671523, 0.0925127, 0.283416738, 0.382530979, 0.803268677
), xy = c("1_15", "2_15", "3_15", "4_15", "5_15", "6_15", "7_15", 
"8_15", "9_15", "1_15", "3_15", "5_15", "7_15", "11_15", "12_15", 
"13_15", "14_15", "15_15", "16_15", "17_15", "18_15", "19_15", 
"12_15", "14_15", "16_15", "18_15")), class = c("data.table", 
"data.frame"), row.names = c(NA, -26L))

Expected output:

[[1]]
[1] "1_15" "3_15" "5_15" "7_15"
[[2]]
[1] "12_15" "14_15" "16_15" "18_15"

2 answers:

Answer 0 (score: 1)

We can use lapply over the unique values of step and perform the Reduce operation for each one.
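The code for this answer did not survive on this page, so what follows is only a sketch of how the lapply approach could look, not the answerer's original code. It reuses the Reduce/intersect expression from the question and replaces the hard-coded step with the loop variable. The names steps, res, and s are illustrative assumptions, and dt here is a reduced copy of the question's data that keeps only the three columns the intersection needs.

```r
library(data.table)

# Reduced copy of the question's data: only step, revision and xy.
dt <- data.table(
  step     = rep(c(35L, 125L), each = 13),
  revision = factor(rep(c("A", "B", "C", "D"), times = c(9, 4, 9, 4))),
  xy       = paste0(c(1:9, 1, 3, 5, 7, 11:19, 12, 14, 16, 18), "_15")
)

# For each distinct step, collect the unique xy values per revision,
# then fold intersect() over that list of character vectors.
steps <- unique(dt$step)
res <- lapply(steps, function(s) {
  Reduce(intersect, dt[step == s, .(list(unique(xy))), by = revision]$V1)
})
names(res) <- steps  # optional: label each element with its step
res
```

Dropping the names(res) line leaves an unnamed list, which is closer to the shape of the expected output shown in the question.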

Answer 1 (score: 1)

We can group by "step" and apply Reduce within each group:

library(data.table)
out <- dt[, Reduce(intersect, .SD[, .(list(unique(xy))), revision]$V1), step]

If the result needs to be a list:

split(out$V1, out$step)
#$`35`
#[1] "1_15" "3_15" "5_15" "7_15"

#$`125`
#[1] "12_15" "14_15" "16_15" "18_15"