I have a data frame that I want to split by a group, so that each piece contains everything from Valley 1 to Valley 2, from Valley 2 to Valley 3, and so on.
Time Peaks ID
1 0.00 Data 1
2 0.36 Data 2
3 0.75 Valley 1
4 1.14 Peak 1
5 1.54 Data 3
6 1.93 Data 4
7 2.32 Valley 2
8 2.72 Peak 2
9 3.12 Valley 3
Desired output:
df1
3 0.75 Valley 1
4 1.14 Peak 1
5 1.54 Data 3
6 1.93 Data 4
7 2.32 Valley 2
df2
7 2.32 Valley 2
8 2.72 Peak 2
9 3.12 Valley 3
Data in a format that can be used directly in R:
dat <- structure(list(Row = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16), Time = c(0, 0.36, 0.75, 1.14, 1.54, 1.93, 2.32,
2.72, 3.12, 3.51, 3.9, 4.3, 4.69, 5.08, 5.47, 5.87), Value = c(455,
456, 456, 459, 456, 456, 455, 458, 455, 458, 455, 456, 461, 458,
458, 459), Peaks = c("Data", "Data", "Valley", "Peak", "Data",
"Data", "Valley", "Peak", "Valley", "Peak", "Valley", "Data",
"Peak", "Data", "Valley", "Data"), Peak_id = c(1, 2, 1, 1, 3,
4, 2, 2, 3, 3, 4, 5, 4, 6, 5, 7)), row.names = c(NA, -16L), class = c("tbl_df",
"tbl", "data.frame"))
Answer 0 (score: 2)
Try the code below; hopefully it helps:
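# Row positions of every Valley in `df` (the data frame to split)
idx <- with(df, which(Peaks == "Valley"))
# Pair each Valley with the next one and keep the rows between them, both ends included
mapply(function(k1, k2) df[k1:k2, ], idx[-length(idx)], idx[-1], SIMPLIFY = FALSE)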
This yields:
[[1]]
Time Value Peaks Peak_id
3 0.75 456 Valley 1
4 1.14 459 Peak 1
5 1.54 456 Data 3
6 1.93 456 Data 4
7 2.32 455 Valley 2
[[2]]
Time Value Peaks Peak_id
7 2.32 455 Valley 2
8 2.72 458 Peak 2
9 3.12 455 Valley 3
[[3]]
Time Value Peaks Peak_id
9 3.12 455 Valley 3
10 3.51 458 Peak 3
11 3.90 455 Valley 4
[[4]]
Time Value Peaks Peak_id
11 3.90 455 Valley 4
12 4.30 456 Data 5
13 4.69 461 Peak 4
14 5.08 458 Data 6
15 5.47 458 Valley 5
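The pairing of consecutive Valley positions can be sketched on the nine-row example from the question. This is only an illustration: the small `df` is rebuilt by hand from the question's table, and the names `starts`, `ends`, and `pieces` are hypothetical, chosen so the result lines up with the `df1`/`df2` naming in the desired output.

```r
# Hypothetical data frame rebuilt from the first nine rows shown in the question
df <- data.frame(
  Time  = c(0, 0.36, 0.75, 1.14, 1.54, 1.93, 2.32, 2.72, 3.12),
  Peaks = c("Data", "Data", "Valley", "Peak", "Data", "Data",
            "Valley", "Peak", "Valley"),
  ID    = c(1, 2, 1, 1, 3, 4, 2, 2, 3)
)

# Row positions of every Valley
idx <- which(df$Peaks == "Valley")

# Each Valley is paired with the one that follows it
starts <- idx[-length(idx)]
ends   <- idx[-1]

# Slice the rows between each pair, keeping both boundary rows
pieces <- Map(function(s, e) df[s:e, ], starts, ends)

# Label the pieces df1, df2, ... (assumed naming, taken from the desired output)
names(pieces) <- sprintf("df%d", seq_along(pieces))
```

Individual pieces can then be read off by name, for example `pieces$df1` or `pieces[["df2"]]`, without creating separate variables in the workspace.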
Answer 1 (score: 1)
This is quite simple using the cumsum and split functions together:
bsp = as.data.frame(list(Time = c(0, 0.36, 0.75, 1.14, 1.54, 1.93, 2.32, 2.72, 3.12, 3.51, 3.9, 4.3, 4.69, 5.08, 5.47, 5.87),
Value = c(455L, 456L, 456L, 459L, 456L, 456L, 455L, 458L, 455L, 458L, 455L, 456L, 461L, 458L, 458L, 459L),
Peaks = c("Data", "Data", "Valley", "Peak", "Data", "Data", "Valley", "Peak", "Valley",
"Peak", "Valley", "Data", "Peak", "Data", "Valley", "Data"),
Peak_id = c(1L, 2L, 1L, 1L, 3L, 4L, 2L, 2L, 3L, 3L, 4L, 5L, 4L, 6L, 5L, 7L)))
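# Running count of Valley rows: each Valley starts a new group (rows before the first Valley get 0)
bsp$group_id = cumsum(bsp$Peaks == 'Valley')
# split(..., by =, keep.by =) is the data.table method, so convert bsp first
library(data.table)
setDT(bsp)
split(bsp, by = "group_id", keep.by = FALSE)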
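Grouping by a running count of Valleys starts a new piece at every Valley but does not repeat the closing Valley row at the end of the previous piece. A minimal base R sketch of one way to get pieces that both start and end on a Valley, as in the desired output, might look like this. The names `bsp_small`, `groups`, and `closed` are hypothetical, and the data is the nine-row example from the question.

```r
# Hypothetical data frame rebuilt from the first nine rows shown in the question
bsp_small <- data.frame(
  Time  = c(0, 0.36, 0.75, 1.14, 1.54, 1.93, 2.32, 2.72, 3.12),
  Peaks = c("Data", "Data", "Valley", "Peak", "Data", "Data",
            "Valley", "Peak", "Valley")
)

# Running count of Valleys seen so far; rows before the first Valley get 0
group_id <- cumsum(bsp_small$Peaks == "Valley")

# Base R split by that counter; the list names are the counter values
groups <- split(bsp_small, group_id)

# Discard the rows that come before the first Valley
groups <- groups[names(groups) != "0"]

# Append the first row of the following group (the next Valley) to each group,
# so every piece runs from one Valley to the next
closed <- Map(function(cur, nxt) rbind(cur, nxt[1, ]),
              groups[-length(groups)], groups[-1])
```

The last group has no following Valley to close it, so it is dropped here; whether that trailing piece should be kept is a choice the question does not settle.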