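I have a data frame of bird data (data_test) in which multiple counts of the same species can occur at the same location (a pond or a transect). I want to summarize the data so that there is one total count per species per location. Some locations are missing coordinate values, but I don't want those records dropped from the dataset. Here is the data: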
# Create data table
location <- c("pondA","pondA","transect1","pondB","pondB","transect2","pondC","transect3","pondD","transect4")
type <- c("ground","ground","air","ground","ground","air","ground","air","ground","air")
easting <- c(NA,NA,18264,NA,NA,46378,NA,86025,NA,46295)
northing <-c(NA,NA,96022,NA,NA,85766,NA,21233,NA,23090)
species <- c("NOPI","NOPI","SCAU","GWTE","GWTE","RUDU","NOPI","GADW","NOPI","MALL")
count <- c(10,23,50,1,2,43,12,3,7,9)
data_test <- data.frame(location=location,type=type,easting=easting,northing=northing,species=species,count=count)
data_test
When I use the aggregate function, it drops the records that are missing easting and northing values:
aggregate(count ~ species + location + type + easting + northing, data=data_test, FUN=sum)
Result:
species location type easting northing count
1 GADW transect3 air 86025 21233 3
2 MALL transect4 air 46295 23090 9
3 RUDU transect2 air 46378 85766 43
4 SCAU transect1 air 18264 96022 50
Using na.action = NULL doesn't work, because the sum is not being run on the easting or northing fields. What I want is:
species location type easting northing count
1 NOPI pondA ground NA NA 33
2 SCAU transect1 air 18264 96022 50
3 GWTE pondB ground NA NA 3
4 RUDU transect2 air 46378 85766 43
5 NOPI pondC ground NA NA 12
6 GADW transect3 air 86025 21233 3
7 NOPI pondD ground NA NA 7
8 MALL transect4 air 46295 23090 9
Any help is much appreciated.
Answer 0 (score: 4)
One data.table route is to group by all of the remaining column names, passed as a character vector with c():
library(data.table)
dt <- as.data.table(data_test)
dt[, .(count = sum(count)), by = c(names(dt)[-6])]
# location type easting northing species count
# 1: pondA ground NA NA NOPI 33
# 2: transect1 air 18264 96022 SCAU 50
# 3: pondB ground NA NA GWTE 3
# 4: transect2 air 46378 85766 RUDU 43
# 5: pondC ground NA NA NOPI 12
# 6: transect3 air 86025 21233 GADW 3
# 7: pondD ground NA NA NOPI 7
# 8: transect4 air 46295 23090 MALL 9
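Selecting the grouping columns by position (names(dt)[-6]) breaks silently if the column order ever changes. Below is a minimal sketch of the same grouping that excludes the count column by name instead. It assumes the data.table package is installed; the name res is only for illustration.

```r
library(data.table)

# Example data from the question, built directly as a data.table
dt <- data.table(
  location = c("pondA","pondA","transect1","pondB","pondB",
               "transect2","pondC","transect3","pondD","transect4"),
  type     = c("ground","ground","air","ground","ground",
               "air","ground","air","ground","air"),
  easting  = c(NA,NA,18264,NA,NA,46378,NA,86025,NA,46295),
  northing = c(NA,NA,96022,NA,NA,85766,NA,21233,NA,23090),
  species  = c("NOPI","NOPI","SCAU","GWTE","GWTE",
               "RUDU","NOPI","GADW","NOPI","MALL"),
  count    = c(10,23,50,1,2,43,12,3,7,9)
)

# Group by every column except "count", chosen by name rather than position.
# Rows whose grouping values are NA are kept and grouped together.
res <- dt[, .(count = sum(count)), by = setdiff(names(dt), "count")]
res
```

Because the grouping columns are computed from the column names, the call keeps working if more descriptive columns are added later, as long as they should all take part in the grouping.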
Answer 1 (score: 2)
Using aggregate, one option is to convert the grouping columns to 'factor' and add NA as one of the levels with addNA:
nm1 <- setdiff(names(data_test), 'count')
data_test[nm1] <- lapply(data_test[nm1], addNA)
aggregate(count~., data_test, FUN=sum)
# location type easting northing species count
#1 transect3 air 86025 21233 GADW 3
#2 pondB ground <NA> <NA> GWTE 3
#3 transect4 air 46295 23090 MALL 9
#4 pondA ground <NA> <NA> NOPI 33
#5 pondC ground <NA> <NA> NOPI 12
#6 pondD ground <NA> <NA> NOPI 7
#7 transect2 air 46378 85766 RUDU 43
#8 transect1 air 18264 96022 SCAU 50
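Note that the code above overwrites the grouping columns of data_test with factors, so easting and northing are no longer numeric afterwards. A minimal sketch of one way around that, assuming it is acceptable to work on a temporary copy and convert the coordinates back after aggregating (the names grp, tmp and res are only for illustration):

```r
# Example data from the question
data_test <- data.frame(
  location = c("pondA","pondA","transect1","pondB","pondB",
               "transect2","pondC","transect3","pondD","transect4"),
  type     = c("ground","ground","air","ground","ground",
               "air","ground","air","ground","air"),
  easting  = c(NA,NA,18264,NA,NA,46378,NA,86025,NA,46295),
  northing = c(NA,NA,96022,NA,NA,85766,NA,21233,NA,23090),
  species  = c("NOPI","NOPI","SCAU","GWTE","GWTE",
               "RUDU","NOPI","GADW","NOPI","MALL"),
  count    = c(10,23,50,1,2,43,12,3,7,9)
)

grp <- setdiff(names(data_test), "count")

# Work on a copy so the original column types in data_test are untouched
tmp <- data_test
tmp[grp] <- lapply(tmp[grp], addNA)   # NA becomes an explicit factor level

res <- aggregate(count ~ ., data = tmp, FUN = sum)

# Convert the coordinate factors back to numeric; the NA level becomes NA again
res$easting  <- as.numeric(as.character(res$easting))
res$northing <- as.numeric(as.character(res$northing))
res
```

Going through as.character first matters: calling as.numeric directly on a factor returns the internal level codes rather than the original coordinate values.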
Or using dplyr:
library(dplyr)
data_test %>%
group_by(across(all_of(nm1))) %>%
summarise(count=sum(count))