Creating a function for mean calculation with specific rules

Time: 2015-12-30 20:03:12

Tags: r aggregate apply mean

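I need to create a function for mean calculation with specific rules, without using the apply or aggregate functions. I have 3 variables, and I want to calculate the mean of var3 for each change of var2, and also the mean of var3 for each change of var1, in the same function. Is this possible? My code is: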

Variable 1

var1 <- sort(rep(LETTERS[1:3],10))

Variable 2

var2 <- rep(1:5,6)

Variable 3

var3 <- rnorm(30)

Create the data frame

DB<-NULL
DB<-cbind(var1,var2,as.numeric(var3))
head(DB)

Function for calculating the mean

mymean <- function(x, db=DB){

for (1:length(db[,1])){

if (db[,[i]] != db[,[i]]) {
mean(db[,[i]])
}
else (db[,[i]] == db[,[i]]) {
stop("invalid rule") 
}}

This is where things start to go wrong, and it doesn't work.

Thanks, Alexandre

1 answer:

Answer 0: (score: 1)

It looks like you want to get means by group.

To do that, I would use dplyr:

library(dplyr)

db <- data.frame(var1 = sort(rep(LETTERS[1:3],10)), var2=rep(1:5,6), var3=rnorm(30))
db %>%
group_by(var1) %>%
summarise(mean_over_va1 = mean(var3))
  var1 mean_over_va1
1    A    0.07314416
2    B   -0.05983557
3    C   -0.03592565

db %>%
group_by(var2) %>%
summarise(mean_over_va2 = mean(var3))

  var2 mean_over_va2
 1    1 -0.4512942044
 2    2 -0.1331316802
 3    3  0.0821958902
 4    4 -0.0001081054
 5    5  0.4646429921

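However, from your comments it seems you don't want to use any base R commands such as apply or aggregate, so I assume you may not like the solution above.

If I had to do this by brute force, I would do something like this: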

db <- data.frame(var1 = sort(rep(LETTERS[1:3],10)), var2=rep(1:5,6), var3=rnorm(30), stringsAsFactors = FALSE)

#Obtaining Groups
group1 <- unique(db$var1)
group2 <- unique(db$var2)

#Obtaining the number of distinct groups so I don't have to keep calling length
N1 <- length(group1)
N2 <- length(group2)

#Preallocating, not necessary but a good habit
res1 <- data.frame(group = group1, mean = rep(NA, N1))
res2 <- data.frame(group = group2, mean = rep(NA, N2))


#Looping over the group members rather than each row of data.
#I like this approach because it relies more heavily on sub-setting
#than it does on iteration, which is always a good idea in R.
for (i in seq_len(N1)){
  res1[i, "mean"] <- mean(db[db$var1 %in% group1[i], "var3"])
}

for (i in seq_len(N2)){
  res2[i, "mean"] <- mean(db[db$var2 %in% group2[i], "var3"])
}

res <- list(res1, res2)
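
Since the question asks for both sets of means from the same function, the loops above could be wrapped up roughly as follows. This is only a sketch: `group_mean` and its arguments `group_col` and `value_col` are hypothetical names I made up for illustration, and reusing the name `mymean` for the wrapper is my assumption about what you were aiming for.

```r
# Hypothetical helper (not part of the original answer): mean of `value_col`
# for each distinct value of `group_col`, using only sub-setting and a loop.
group_mean <- function(db, group_col, value_col = "var3") {
  groups <- unique(db[[group_col]])
  # Preallocate one row per group
  res <- data.frame(group = groups, mean = rep(NA_real_, length(groups)))
  for (i in seq_along(groups)) {
    res[i, "mean"] <- mean(db[db[[group_col]] %in% groups[i], value_col])
  }
  res
}

# Hypothetical wrapper returning both sets of means from a single call.
mymean <- function(db) {
  list(by_var1 = group_mean(db, "var1"),
       by_var2 = group_mean(db, "var2"))
}

db <- data.frame(var1 = sort(rep(LETTERS[1:3], 10)),
                 var2 = rep(1:5, 6),
                 var3 = rnorm(30),
                 stringsAsFactors = FALSE)

res <- mymean(db)
res$by_var1  # one row per value of var1
res$by_var2  # one row per value of var2
```

Because `var3` comes from `rnorm`, the actual means will differ on every run unless you call `set.seed()` first.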