How do you use aggregate when your only ID column is a character (or factor) column?
aggregate(data, list(data$colID), sum)
Error in Summary.factor(c(1L, 1L), na.rm = FALSE) :
sum not meaningful for factors
After converting it to character:
data$colID<-as.character(data$colID)
aggregate(data, list(data$colID), sum)
Error in FUN(X[[1L]], ...) : invalid 'type' (character) of argument
With ddply I get a similar error:
Error in FUN(X[[1L]], ...) :
only defined on a data frame with all numeric variables
I only want to group by colID; I don't want to sum it. I want the sums of all the other columns.
dput(data)
structure(list(colID = structure(c(1L, 1L, 1L, 2L, 2L), .Label = c("a",
"b"), class = "factor"), col1 = c(1, 0, 0, 0, 2), col2 = c(0,
1, 0, 2, 0), col3 = c(0, 0, 1, 0, 0), col4 = c(5, 5, 5, 7, 7)), .Names = c("colID",
"col1", "col2", "col3", "col4"), row.names = c(NA, -5L), class = "data.frame")
Answer 0 (score: 5)
This should work:
aggregate(x = DF[, -1], by = list(DF$colID), FUN = "sum")
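where DF is your data.frame. Dropping the first column with DF[, -1] keeps colID out of the sum, which is what caused the errors in the question.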
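As a hedged alternative, the formula interface of aggregate should give the same sums without dropping the ID column by position. This is only a sketch on the example data from the question, rebuilt here as DF; the names res and res_chr are just for illustration.

```r
# Rebuild the example data from the question (colID is a factor here).
DF <- data.frame(
  colID = factor(c("a", "a", "a", "b", "b")),
  col1 = c(1, 0, 0, 0, 2),
  col2 = c(0, 1, 0, 2, 0),
  col3 = c(0, 0, 1, 0, 0),
  col4 = c(5, 5, 5, 7, 7)
)

# "." on the left-hand side stands for every column not used on the
# right-hand side, so colID is used for grouping only and is never summed.
res <- aggregate(. ~ colID, data = DF, FUN = sum)

# The same call also works once the ID column has been converted to character.
DF_chr <- DF
DF_chr$colID <- as.character(DF_chr$colID)
res_chr <- aggregate(. ~ colID, data = DF_chr, FUN = sum)
```

Both res and res_chr should contain one row per colID value with the summed col1 to col4.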
Using ddply from the plyr package:
ddply(DF, .(colID), numcolwise(sum))
colID col1 col2 col3 col4
1 a 1 1 1 15
2 b 2 2 0 14
Using acast or dcast from the reshape2 package:
acast( melt(DF), variable ~ colID, sum) # a matrix
dcast( melt(DF), variable ~ colID, sum) # a data.frame
Using colID as id variables
a b
col1 1 2
col2 1 2
col3 1 0
col4 15 14
Edit
Using ddply to get the sums of col1 to col3 and the mean of col4. Not very elegant, but it works!
Sums <- ddply(DF[, -5], .(colID), numcolwise(sum))
Mean <- ddply(DF[, c(1,5)], .(colID), numcolwise(mean))[,-1]
cbind(Sums, col4_mean=Mean)
colID col1 col2 col3 col4_mean
1 a 1 1 1 5
2 b 2 2 0 7
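For comparison, here is a base R sketch of the same "sum some columns, average another" idea that does not need plyr. It assumes the example data from the question, rebuilt here as DF; the names sums, means and res are only illustrative.

```r
# Rebuild the example data from the question.
DF <- data.frame(
  colID = factor(c("a", "a", "a", "b", "b")),
  col1 = c(1, 0, 0, 0, 2),
  col2 = c(0, 1, 0, 2, 0),
  col3 = c(0, 0, 1, 0, 0),
  col4 = c(5, 5, 5, 7, 7)
)

# Sum col1..col3 per group.
sums <- aggregate(cbind(col1, col2, col3) ~ colID, data = DF, FUN = sum)

# Average col4 per group and rename the result column.
means <- aggregate(col4 ~ colID, data = DF, FUN = mean)
names(means)[2] <- "col4_mean"

# Join the two summaries on the grouping column.
res <- merge(sums, means, by = "colID")
```

The two-step approach mirrors the ddply version above: each summary is computed separately and then joined on colID.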
Answer 1 (score: 3)
Here is a data.table solution. It doesn't matter whether the by column is a factor or a character.
library(data.table)
DT <- as.data.table(data)
# Calculate the sum of columns col1, ..., col3
# and the mean of col4, then join the two results on colID
merge(DT[, lapply(.SD, sum), by = colID, .SDcols = paste0('col', 1:3)],
      DT[, lapply(.SD, mean), by = colID, .SDcols = 'col4'],
      by = 'colID')
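Answer 2 (score: 1)
library(reshape2)
# df is assumed to be your data frame (called data in the question)
# Reshape to long format, keeping colID as the identifier
melted_data <- melt(df, id.vars = "colID")
# Cast back to wide format, summing the values within each colID
dcast(melted_data, colID ~ variable, sum)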