Apply a list of functions to a list of columns based on different combinations

Date: 2015-06-30 08:40:38

Tags: r dataframe apply sapply

My data frame df contains three categorical variables (cat1, cat2, cat3) and two continuous variables (con1, con2). I would like to apply a list of functions (sd, mean) to a list of columns (con1, con2) for the different combinations of the columns cat1, cat2 and cat3. So far I have done this by explicitly subsetting every combination:

# Random generation of values for categorical data
set.seed(33)
df <- data.frame(cat1 = sample(LETTERS[1:2], 100, replace = TRUE),
                 cat2 = sample(LETTERS[3:5], 100, replace = TRUE),
                 cat3 = sample(LETTERS[2:4], 100, replace = TRUE),
                 con1 = runif(100, 0, 100),
                 con2 = runif(100, 23, 45))

# Introducing null values
df$con1[c(23, 53, 92)] <- NA
df$con2[c(33, 46)] <- NA

results <- data.frame()
funs <- list(sd = sd, mean = mean)

# calculation of mean and sd on total observations
sapply(funs, function(x) sapply(df[, c(4, 5)], x, na.rm = TRUE))

# calculation of mean and sd on different levels of cat1
sapply(funs, function(x) sapply(df[df$cat1 == 'A', c(4, 5)], x, na.rm = TRUE))
sapply(funs, function(x) sapply(df[df$cat1 == 'B', c(4, 5)], x, na.rm = TRUE))

# calculation of mean and sd on different levels of cat1 and cat2
sapply(funs, function(x) sapply(df[df$cat1 == 'A' & df$cat2 == 'C', c(4, 5)], x, na.rm = TRUE))
# . . .
sapply(funs, function(x) sapply(df[df$cat1 == 'B' & df$cat2 == 'E', c(4, 5)], x, na.rm = TRUE))

# Similarly for the combinations of three cat variables cat1, cat2, cat3

I would like to write a function that dynamically applies the list of functions to the list of columns for each of the different combinations. Could you please give me some suggestions? Thank you!

Edit: I have already received some sensible suggestions. It would be nice if someone could also suggest a solution using the dplyr family of functions, since that would help me use them (on data frames) for further requirements.

1 answer:

Answer 0 (score: 1)

Here is a simple one-line base R solution:

> do.call(cbind, lapply(funs, function(x) aggregate(cbind(con1, con2) ~ cat1 + cat2 + cat3, data = df, FUN = x, na.rm = TRUE)))
   sd.cat1 sd.cat2 sd.cat3  sd.con1   sd.con2 mean.cat1 mean.cat2 mean.cat3 mean.con1 mean.con2
1        A       C       B       NA        NA         A         C         B  25.52641  37.40603
2        B       C       B 32.67192  6.966547         B         C         B  46.70387  34.85437
3        A       D       B 31.05224  6.530313         A         D         B  37.91553  37.13142
4        B       D       B 23.80335  6.001468         B         D         B  59.75107  30.29681
5        A       E       B 22.79285  1.526472         A         E         B  38.54742  25.23007
6        B       E       B 32.92139  2.621067         B         E         B  51.56253  29.52367
7        A       C       C 26.98661  5.710335         A         C         C  36.32045  36.42465
8        B       C       C 20.22217  8.117184         B         C         C  60.60036  34.98460
9        A       D       C 33.39273  7.367412         A         D         C  40.77786  35.03747
10       B       D       C 12.95351  8.829061         B         D         C  49.77160  33.21836
11       A       E       C 33.73433  4.689548         A         E         C  55.53135  32.38279
12       B       E       C 25.38637  9.172137         B         E         C  46.69063  31.56733
13       A       C       D 36.12545  6.323929         A         C         D  48.34187  32.36789
14       B       C       D 30.01992  7.130869         B         C         D  53.87571  33.12760
15       A       D       D 15.94151 11.756115         A         D         D  35.89909  31.76871
16       B       D       D 10.89030  6.829829         B         D         D  22.86577  32.53725
17       A       E       D 24.88410  6.108631         A         E         D  47.32549  35.22782
18       B       E       D 12.73711  8.151424         B         E         D  33.95569  36.70167
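
The one-liner above only covers the full three-way combination of cat1, cat2 and cat3. To get the one-way and two-way combinations from the question as well, something along these lines might work. This is only a sketch in base R: the helper name summarise_combos and its arguments are made up for illustration and are not part of the original answer. It enumerates the grouping subsets with combn() and calls aggregate() once per function.

```r
# Hypothetical helper (not from the original answer): apply every function in
# `funs` to every column in `value_vars`, for each non-empty combination of
# the grouping columns in `group_vars`.
summarise_combos <- function(data, group_vars, value_vars, funs) {
  # All non-empty subsets of the grouping columns:
  # "cat1", "cat2", ..., c("cat1", "cat2"), ..., c("cat1", "cat2", "cat3")
  combos <- unlist(
    lapply(seq_along(group_vars),
           function(k) combn(group_vars, k, simplify = FALSE)),
    recursive = FALSE
  )
  names(combos) <- vapply(combos, paste, character(1), collapse = "+")

  lapply(combos, function(g) {
    # One aggregate() call per function; NAs are dropped column by column
    per_fun <- lapply(names(funs), function(fn) {
      out <- aggregate(data[value_vars], by = data[g],
                       FUN = function(v) funs[[fn]](v, na.rm = TRUE))
      # Prefix the value columns with the function name, e.g. "sd.con1"
      names(out)[names(out) %in% value_vars] <- paste(fn, value_vars, sep = ".")
      out
    })
    # Join the per-function tables on the grouping columns
    Reduce(function(a, b) merge(a, b, by = g), per_fun)
  })
}

# Same data as in the question
set.seed(33)
df <- data.frame(cat1 = sample(LETTERS[1:2], 100, replace = TRUE),
                 cat2 = sample(LETTERS[3:5], 100, replace = TRUE),
                 cat3 = sample(LETTERS[2:4], 100, replace = TRUE),
                 con1 = runif(100, 0, 100),
                 con2 = runif(100, 23, 45))
df$con1[c(23, 53, 92)] <- NA
df$con2[c(33, 46)] <- NA
funs <- list(sd = sd, mean = mean)

res <- summarise_combos(df, c("cat1", "cat2", "cat3"), c("con1", "con2"), funs)
names(res)          # one entry per combination of grouping columns
res[["cat1+cat2"]]  # sd and mean of con1 and con2 for each cat1/cat2 pair
```

One thing to be aware of: the formula interface of aggregate() used in the one-liner applies na.action = na.omit by default, so it drops any row where either con1 or con2 is NA before computing anything. The sketch here uses the data frame interface instead and removes NAs per column, which matches the sapply calls in the question, so its numbers may differ slightly from the table above. For the dplyr route asked about in the question, group_by() followed by summarise() with across() would be the analogous approach.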