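I have been struggling to merge columns between two particular data frames, and also to merge rows within a data frame while summing their values. First, I would like to add the "X" and "Y" columns from Table 1 to the end of Table 2. In Table 2, some towns are repeated, such as "Town A". I would like to merge those rows and sum the data they contain.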
Table 1
|Town| X| Y |
|A | 21| 23|
|A | 21| 23|
|B | 21| 23|
|C | 21| 23|
|D | 21| 23|
|D | 21| 23|
|E | 21| 23|
|E | 21| 23|
|F | 21| 23|
|F | 21| 23|
Table 2
|Town| Species A| Species B | Species C| Species D| Species E | Species F |
|A | 21| 23| 15| 0 | 3 | 7|
|A | 21| 23| 15| 0 | 3 | 7|
|B | 21| 23| 15| 0 | 3 | 7|
|C | 21| 23| 15| 0 | 3 | 7|
|D | 21| 23| 15| 0 | 3 | 7|
|D | 21| 23| 15| 0 | 3 | 7|
|E | 21| 23| 15| 0 | 3 | 7|
|E | 21| 23| 15| 0 | 3 | 7|
|F | 21| 23| 15| 0 | 3 | 7|
|F | 21| 23| 15| 0 | 3 | 7|
Some of the code I have tried uses the cbind and merge functions. I also tried run.seq, as shown below:
run.seq <- function(x) as.numeric(ave(paste(x), x, FUN = seq_along))
L <- list(df1, df2)
L2 <- lapply(L, function(x) cbind(x, run.seq = run.seq("Town")))
out <- Reduce(function(...) merge(..., all = TRUE), L2)[-2]
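which didn't quite work.
What kind of code would be best for this sort of merge/combination?
In case it helps, here is the structure of the two tables:
Table 1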
structure(list(Town = c("A", "A", "B", "C", "D", "D", "E", "E", "F", "F"), Captured = c(168L, 16L, 243L, 12L, 17L, 15L, 7L, 233L, 14L, 12L), Proportion = c(0.23, 0.02, 0.33, 0.02, 0.02, 0.02, 0.01, 0.32, 0.02, 0.02)), class = "data.frame", .Names = c("Town", "Captured", "Proportion"), row.names = c(NA,-10L))
Table 2
structure(c(106L, 7L, 5L, 4L, 4L, 4L, 4L, 18L, 5L, 3L, 38L, 6L, 234L, 6L, 8L, 5L, 3L, 203L, 4L, 7L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 2L, 0L, 1L, 0L, 0L, 1L, 0L, 2L, 0L, 0L, 20L, 2L, 3L, 2L, 5L, 5L, 0L, 7L, 5L, 2L), .Dim = c(10L, 6L), .Dimnames = structure(list(Town = c("A", "A", "B", "C", "D", "D", "E", "E", "F", "F"), Species = c("funestus", "gambiae", "indeterminada", "outro", "pharoenois", "tenebrosus")), .Names = c("Town", "Species")), class = "table")
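Answer 0 (score: 3)
It is best to aggregate first and then merge/join the two datasets. Because table 2 is stored as a table object, you can also use the melt and dcast functions from reshape2 with sum as the aggregation function (which yields a data frame), and then merge the result with the aggregated t1 data frame: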
library(reshape2)
# aggregate 't1'
t1sum <- aggregate(.~Town, t1, sum)
# reshape and aggregate 't2'
t2sum <- dcast(melt(t2), Town ~ Species, fun.aggregate = sum)
# or with 'as.data.frame(t2)' instead of 'melt(t2)'
t2sum <- dcast(as.data.frame(t2), Town ~ Species, fun.aggregate = sum)
merge(t1sum, t2sum, by = 'Town')
which gives:
Town Captured Proportion funestus gambiae indeterminada outro pharoenois tenebrosus
1 A 184 0.25 113 44 1 2 2 22
2 B 243 0.33 5 234 0 0 1 3
3 C 12 0.02 4 6 0 0 0 2
4 D 32 0.04 8 13 0 0 1 10
5 E 240 0.33 22 206 0 0 2 7
6 F 26 0.04 8 11 0 0 0 7
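If you would rather not depend on reshape2, the same aggregate-then-merge idea can probably be expressed in base R alone. The following is only a sketch under that assumption, not part of the approach above: it reuses the t1 and t2 objects from the question, and the intermediate names t2long, t2wide and res are made up for illustration.

```r
# Data as posted in the question
t1 <- structure(list(Town = c("A", "A", "B", "C", "D", "D", "E", "E", "F", "F"), Captured = c(168L, 16L, 243L, 12L, 17L, 15L, 7L, 233L, 14L, 12L), Proportion = c(0.23, 0.02, 0.33, 0.02, 0.02, 0.02, 0.01, 0.32, 0.02, 0.02)), class = "data.frame", .Names = c("Town", "Captured", "Proportion"), row.names = c(NA,-10L))
t2 <- structure(c(106L, 7L, 5L, 4L, 4L, 4L, 4L, 18L, 5L, 3L, 38L, 6L, 234L, 6L, 8L, 5L, 3L, 203L, 4L, 7L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 2L, 0L, 1L, 0L, 0L, 1L, 0L, 2L, 0L, 0L, 20L, 2L, 3L, 2L, 5L, 5L, 0L, 7L, 5L, 2L), .Dim = c(10L, 6L), .Dimnames = structure(list(Town = c("A", "A", "B", "C", "D", "D", "E", "E", "F", "F"), Species = c("funestus", "gambiae", "indeterminada", "outro", "pharoenois", "tenebrosus")), .Names = c("Town", "Species")), class = "table")

# Sum every numeric column of 't1' per town
t1sum <- aggregate(. ~ Town, data = t1, FUN = sum)

# Long format: one row per Town/Species cell, counts in 'Freq'
t2long <- as.data.frame(t2, stringsAsFactors = FALSE)

# xtabs() adds up 'Freq' within each Town/Species combination,
# which collapses the repeated towns
t2wide <- xtabs(Freq ~ Town + Species, data = t2long)

# Back to a plain data frame with 'Town' as an ordinary column
t2sum <- data.frame(Town = rownames(t2wide),
                    as.data.frame.matrix(t2wide),
                    row.names = NULL)

res <- merge(t1sum, t2sum, by = "Town")
res
```

The merged result should have one row per town, with the same sums as the reshape2 version.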
With the data.table package, you can do something similar:
library(data.table)
t1dt <- setDT(t1)[, lapply(.SD, sum), by = Town]
t2dt <- dcast(setDT(melt(t2)), Town ~ Species, fun.aggregate = sum)
t1dt[t2dt, on='Town']
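To make the data.table steps easier to follow, here is a small sketch on toy data. The values below are hypothetical and are not the asker's data. It also assumes two deliberate variations: as.data.table() is used instead of setDT() so that t1 is not modified by reference, and the table object is converted with as.data.frame() so that reshape2 is not needed.

```r
library(data.table)

# Hypothetical toy data with the same shape as 't1' and 't2'
t1 <- data.frame(Town = c("A", "A", "B"), Captured = c(3L, 4L, 5L))
t2 <- structure(c(1L, 2L, 3L, 4L, 5L, 6L), .Dim = c(3L, 2L),
                .Dimnames = list(Town = c("A", "A", "B"),
                                 Species = c("funestus", "gambiae")),
                class = "table")

# Sum all non-grouping columns of 't1' per town (works on a copy)
t1dt <- as.data.table(t1)[, lapply(.SD, sum), by = Town]

# Long format with counts in 'Freq', then cast back to wide while summing
t2long <- as.data.table(as.data.frame(t2, stringsAsFactors = FALSE))
t2dt <- dcast(t2long, Town ~ Species, value.var = "Freq", fun.aggregate = sum)

# Join the aggregated 't1' columns onto the rows of 't2dt'
res <- t1dt[t2dt, on = "Town"]
res
```

Town A appears twice in both toy inputs, so its rows are summed before the join and the result has one row per town.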
Data used:
t1 <- structure(list(Town = c("A", "A", "B", "C", "D", "D", "E", "E", "F", "F"), Captured = c(168L, 16L, 243L, 12L, 17L, 15L, 7L, 233L, 14L, 12L), Proportion = c(0.23, 0.02, 0.33, 0.02, 0.02, 0.02, 0.01, 0.32, 0.02, 0.02)), class = "data.frame", .Names = c("Town", "Captured", "Proportion"), row.names = c(NA,-10L))
t2 <- structure(c(106L, 7L, 5L, 4L, 4L, 4L, 4L, 18L, 5L, 3L, 38L, 6L, 234L, 6L, 8L, 5L, 3L, 203L, 4L, 7L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 2L, 0L, 1L, 0L, 0L, 1L, 0L, 2L, 0L, 0L, 20L, 2L, 3L, 2L, 5L, 5L, 0L, 7L, 5L, 2L), .Dim = c(10L, 6L), .Dimnames = structure(list(Town = c("A", "A", "B", "C", "D", "D", "E", "E", "F", "F"), Species = c("funestus", "gambiae", "indeterminada", "outro", "pharoenois", "tenebrosus")), .Names = c("Town", "Species")), class = "table")