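I want to split a data frame into a list of data frames. The reason for splitting is that the rows always come in the order "father", followed by "mother", followed by "offspring". However, each of these family members can occupy more than one row (always consecutive rows; for example, father number 1 is in row 1 and row 2). In my example below I have two families, so I am trying to get a list containing two data frames.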
My input:
df <- 'Chr Start End Family
1 187546286 187552094 father
3 108028534 108032021 father
1 4864403 4878685 mother
1 18898657 18904908 mother
2 460238 461771 offspring
3 108028534 108032021 offspring
1 71481449 71532983 father
2 74507242 74511395 father
2 181864092 181864690 mother
1 71481449 71532983 offspring
2 181864092 181864690 offspring
3 160057791 160113642 offspring'
df <- read.table(text = df, header = TRUE)
Therefore, my expected output for dfout[[1]] would look like this:
dfout <- 'Chr Start End Family
1 187546286 187552094 father
3 108028534 108032021 father
1 4864403 4878685 mother
1 18898657 18904908 mother
2 460238 461771 offspring
3 108028534 108032021 offspring'
dfout <- read.table(text = dfout, header = TRUE)
Answer 0 (score: 1)
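To split each family into its own data frame, you need an index that indicates where one family ends and the next begins. For the index, I use "father" as the change point. But we cannot simply use indx <- df$Family == "father", because there can be several consecutive "father" entries. Instead, we take diff() of that logical vector and look for the positions where it equals 1, which are exactly the places where the column switches from "offspring" to "father". A leading 1 marks the start of the first family, and cumsum() turns those start markers into a running family number.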
indx <- cumsum(c(1L, diff(df$Family == "father")) == 1L)
split(df, indx)
# $`1`
# Chr Start End Family
# 1 1 187546286 187552094 father
# 2 3 108028534 108032021 father
# 3 1 4864403 4878685 mother
# 4 1 18898657 18904908 mother
# 5 2 460238 461771 offspring
# 6 3 108028534 108032021 offspring
#
# $`2`
# Chr Start End Family
# 7 1 71481449 71532983 father
# 8 2 74507242 74511395 father
# 9 2 181864092 181864690 mother
# 10 1 71481449 71532983 offspring
# 11 2 181864092 181864690 offspring
# 12 3 160057791 160113642 offspring
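To make the one-liner above easier to follow, here is a minimal sketch that builds the same index step by step. It uses only the Family column from the question; the intermediate names is_father and change are illustrative and do not appear in the original answer.

```r
# Family column from the question, as a plain character vector
family <- c("father", "father", "mother", "mother", "offspring", "offspring",
            "father", "father", "mother", "offspring", "offspring", "offspring")

# TRUE on every father row, FALSE elsewhere
is_father <- family == "father"

# diff() is 1 where a run of fathers starts, -1 where it ends, 0 otherwise;
# the leading 1L marks the start of the first family
change <- c(1L, diff(is_father))

# Counting the starts seen so far gives one id per family
indx <- cumsum(change == 1L)
indx
# 1 1 1 1 1 1 2 2 2 2 2 2
```

This relies on every family beginning with at least one "father" row, as stated in the question; a family without a father would be merged into the previous one.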
Answer 1 (score: 0)
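It would have been more helpful if you had posted the code that generates your actual data frame. I don't have time to redo everything, but I'll show you how it works in a general sense.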
gender <- c("M","M","F","F","F","F","M","M","M","M","F","F")
values <- c(20,22,24,19,9,17,18,22,12,14,7,8)
fruit <- c("apple","pear","mango","mango","mango","apple","banana","banana","banana","mango","apple","apple")
df <- data.frame(gender, values, fruit)
> df
gender values fruit
1 M 20 apple
2 M 22 pear
3 F 24 mango
4 F 19 mango
5 F 9 mango
6 F 17 apple
7 M 18 banana
8 M 22 banana
9 M 12 banana
10 M 14 mango
11 F 7 apple
12 F 8 apple
split(df, df$gender)
$F
gender values fruit
3 F 24 mango
4 F 19 mango
5 F 9 mango
6 F 17 apple
11 F 7 apple
12 F 8 apple
$M
gender values fruit
1 M 20 apple
2 M 22 pear
7 M 18 banana
8 M 22 banana
9 M 12 banana
10 M 14 mango
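Note that split(df, df$gender) groups by value, so rows 1-2 and rows 7-10 end up in the same data frame even though they are not adjacent. A minimal sketch, assuming you instead want each consecutive run kept separate (which is closer to what the question asks for), could use base R's rle(). The names runs, run_id and by_run are illustrative only.

```r
gender <- c("M","M","F","F","F","F","M","M","M","M","F","F")
values <- c(20,22,24,19,9,17,18,22,12,14,7,8)
df <- data.frame(gender, values)

# Run-length encoding: one entry per block of consecutive identical values
runs <- rle(as.character(df$gender))

# Repeat each run's position by its length to get one id per row
run_id <- rep(seq_along(runs$lengths), runs$lengths)

# One data frame per consecutive run instead of one per distinct value
by_run <- split(df, run_id)
length(by_run)
# 4
```

For the family data in the question, a run id on the Family column alone would separate fathers, mothers and offspring, so the switch back to "father" still has to be used as the family boundary.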