R: Calculating margins, or row and column sums, for a data frame

Time: 2011-05-02 23:09:19

Tags: r dataframe margins

I have a data frame that looks like this:

         Flag1             Flag2    Type1 Type2  Type3
1        A                 FIRST      2    0       0
2        A                SECOND      1    9       0
3        A                 THIRD      3    7       0
4        A                FOURTH      9   18       0
5        A                 FIFTH      1   22       0
6        A                 SIXTH      1   13       0
7        B                 FIRST      0    0       0
8        B                SECOND      3    9       0
9        B                 THIRD      5   85       0
10       B                FOURTH      4   96       0
11       B                 FIFTH      3   40       0
12       B                 SIXTH      0   17       0

I need to add totals so that my data frame ends up looking like this:

         Flag1             Flag2    Type1 Type2  Type3   Sum
1        A                 FIRST      2    0       0      2
2        A                SECOND      1    9       0     10 
3        A                 THIRD      3    7       0     10
4        A                FOURTH      9   18       0     27
5        A                 FIFTH      1   22       0     23
6        A                 SIXTH      1   13       0     14
7        B                 FIRST      0    0       0      0
8        B                SECOND      3    9       0     12
9        B                 THIRD      5   85       0     90
10       B                FOURTH      4   96       0    100
11       B                 FIFTH      3   40       0     43
12       B                 SIXTH      0   17       0     17 
13      (all)              FIRST      2    0       0      2
14      (all)             SECOND      4   18       0     22
15      (all)              THIRD      8   92       0    100
16      (all)             FOURTH     13  114       0    127
17      (all)              FIFTH      4   62       0     66
18      (all)              SIXTH      1   30       0     31
19       A                 (all)     17   69       0     86
20       B                 (all)     15  247       0    262
21      (all)              (all)     32  316       0    348

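I tried the add_margins function in the reshape2 package, but it did not help: it does not compute the sums the way I want. I also tried aggregate, rowSums and colSums, with no success.

Any help here would be great.

Thanks

The Sum column also needs to add in the sums of the preceding Flag2 rows, i.e. a running total within each Flag1 group. Like this: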

         Flag1             Flag2    Type1 Type2  Type3   Sum
1        A                 FIRST      2    0       0      2
2        A                SECOND      1    9       0     12 
3        A                 THIRD      3    7       0     22
4        A                FOURTH      9   18       0     49
5        A                 FIFTH      1   22       0     72
6        A                 SIXTH      1   13       0     86
7        B                 FIRST      0    0       0      0
8        B                SECOND      3    9       0     12
9        B                 THIRD      5   85       0    102
10       B                FOURTH      4   96       0    202
11       B                 FIFTH      3   40       0    245
12       B                 SIXTH      0   17       0    262 
13      (all)              FIRST      2    0       0      2
14      (all)             SECOND      4   18       0     24
15      (all)              THIRD      8   92       0    124
16      (all)             FOURTH     13  114       0    251
17      (all)              FIFTH      4   62       0    317
18      (all)              SIXTH      1   30       0    348
19       A                 (all)     17   69       0     86
20       B                 (all)     15  247       0    262
21      (all)              (all)     32  316       0    348

2 Answers:

Answer 0 (score: 5)

Assuming your data is in a data frame named dtable:

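# Cross-tabulate each Type column by Flag1 x Flag2 and append "Sum" margins
dt1 <- as.data.frame(addmargins(xtabs(Type1~Flag1+Flag2, data=dtable)))
dt2 <- as.data.frame(addmargins(xtabs(Type2~Flag1+Flag2, data=dtable)))
dt3 <- as.data.frame(addmargins(xtabs(Type3~Flag1+Flag2, data=dtable)))
# as.data.frame() names the value column "Freq"; rename it after its source column
names(dt1)[3] <- "Type1"
names(dt2)[3] <- "Type2"
names(dt3)[3] <- "Type3"

# merge() joins on the shared Flag1 and Flag2 columns
dt.all <- merge(merge(dt1,dt2), dt3)
dt.all$Sum <- with(dt.all, Type1+Type2+Type3)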

I could not get the exact sort order you want, but this is close:

# Reorder the levels with factor(); assigning to levels() would only rename them
dt.all$Flag2 <- factor(dt.all$Flag2, levels = c("FIRST", "SECOND", "THIRD", "FOURTH", "FIFTH", "SIXTH", "Sum"))
dt.all[order(dt.all$Flag1, dt.all$Flag2), ]

   Flag1  Flag2 Type1 Type2 Type3 Sum
2      A  FIRST     2     0     0   2
4      A SECOND     1     9     0  10
7      A  THIRD     3     7     0  10
3      A FOURTH     9    18     0  27
1      A  FIFTH     1    22     0  23
5      A  SIXTH     1    13     0  14
6      A    Sum    17    69     0  86
9      B  FIRST     0     0     0   0
11     B SECOND     3     9     0  12
14     B  THIRD     5    85     0  90
10     B FOURTH     4    96     0 100
8      B  FIFTH     3    40     0  43
12     B  SIXTH     0    17     0  17
13     B    Sum    15   247     0 262
16   Sum  FIRST     2     0     0   2
18   Sum SECOND     4    18     0  22
21   Sum  THIRD     8    92     0 100
17   Sum FOURTH    13   114     0 127
15   Sum  FIFTH     4    62     0  66
19   Sum  SIXTH     1    30     0  31
20   Sum    Sum    32   316     0 348
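
The three near-identical xtabs calls in this answer can probably be folded into a loop. The following is only a minimal sketch of that idea, not the answerer's code: it rebuilds the example data from the question, sets the Flag2 level order before tabulating so that sorting follows FIRST through SIXTH, and uses the illustrative names flag2_order, type_cols and margin_list, which do not appear in the original.

```r
# Rebuild the example data (values copied from the question).
dtable <- data.frame(
  Flag1 = rep(c("A", "B"), each = 6),
  Flag2 = rep(c("FIRST", "SECOND", "THIRD", "FOURTH", "FIFTH", "SIXTH"), times = 2),
  Type1 = c(2, 1, 3, 9, 1, 1, 0, 3, 5, 4, 3, 0),
  Type2 = c(0, 9, 7, 18, 22, 13, 0, 9, 85, 96, 40, 17),
  Type3 = 0
)

# Fix the level order up front so xtabs() keeps FIRST..SIXTH
# instead of sorting the labels alphabetically.
flag2_order <- c("FIRST", "SECOND", "THIRD", "FOURTH", "FIFTH", "SIXTH")
dtable$Flag2 <- factor(dtable$Flag2, levels = flag2_order)

type_cols <- c("Type1", "Type2", "Type3")

# One margin table per Type column, each converted to a long data frame.
margin_list <- lapply(type_cols, function(col) {
  # Builds the formula  <col> ~ Flag1 + Flag2
  f <- reformulate(c("Flag1", "Flag2"), response = col)
  out <- as.data.frame(addmargins(xtabs(f, data = dtable)))
  names(out)[3] <- col  # as.data.frame() calls the value column "Freq"
  out
})

# Merge on the shared Flag1/Flag2 columns, then add the row total.
dt.all <- Reduce(merge, margin_list)
dt.all$Sum <- rowSums(dt.all[type_cols])

# Sort by factor level: A, B, Sum and FIRST..SIXTH, Sum.
dt.all <- dt.all[order(dt.all$Flag1, dt.all$Flag2), ]
dt.all
```

Because Flag2 is a factor before xtabs runs, the "Sum" level that addmargins appends ends up last, so no relabelling is needed afterwards.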

Answer 1 (score: 2)

rowSums works for me (or am I missing something?).

> my.df <- read.table(textConnection("         Flag1             Flag2    Type1 Type2  Type3
+ 1        A                 FIRST      2    0       0
+ 2        A                SECOND      1    9       0
+ 3        A                 THIRD      3    7       0
+ 4        A                FOURTH      9   18       0
+ 5        A                 FIFTH      1   22       0
+ 6        A                 SIXTH      1   13       0
+ 7        B                 FIRST      0    0       0
+ 8        B                SECOND      3    9       0
+ 9        B                 THIRD      5   85       0
+ 10       B                FOURTH      4   96       0
+ 11       B                 FIFTH      3   40       0
+ 12       B                 SIXTH      0   17       0
+ "))
> my.df
   Flag1  Flag2 Type1 Type2 Type3
1      A  FIRST     2     0     0
2      A SECOND     1     9     0
3      A  THIRD     3     7     0
4      A FOURTH     9    18     0
5      A  FIFTH     1    22     0
6      A  SIXTH     1    13     0
7      B  FIRST     0     0     0
8      B SECOND     3     9     0
9      B  THIRD     5    85     0
10     B FOURTH     4    96     0
11     B  FIFTH     3    40     0
12     B  SIXTH     0    17     0
> rowSums(my.df[3:5])
  1   2   3   4   5   6   7   8   9  10  11  12 
  2  10  10  27  23  14   0  12  90 100  43  17 
> my.df$Sum <- rowSums(my.df[3:5])
> my.df
   Flag1  Flag2 Type1 Type2 Type3 Sum
1      A  FIRST     2     0     0   2
2      A SECOND     1     9     0  10
3      A  THIRD     3     7     0  10
4      A FOURTH     9    18     0  27
5      A  FIFTH     1    22     0  23
6      A  SIXTH     1    13     0  14
7      B  FIRST     0     0     0   0
8      B SECOND     3     9     0  12
9      B  THIRD     5    85     0  90
10     B FOURTH     4    96     0 100
11     B  FIFTH     3    40     0  43
12     B  SIXTH     0    17     0  17
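
The rowSums result above covers the per-row Sum column, but not the "(all)" margin rows or the running total asked for in the follow-up. One way to get there in base R might look like the sketch below. It is an illustration under assumptions rather than part of the original answer: the helper margin_rows and the column name RunningSum are hypothetical, and the running total relies on the rows already being in FIRST..SIXTH order within each group.

```r
# Rebuild the example data (values copied from the question).
my.df <- data.frame(
  Flag1 = rep(c("A", "B"), each = 6),
  Flag2 = rep(c("FIRST", "SECOND", "THIRD", "FOURTH", "FIFTH", "SIXTH"), times = 2),
  Type1 = c(2, 1, 3, 9, 1, 1, 0, 3, 5, 4, 3, 0),
  Type2 = c(0, 9, 7, 18, 22, 13, 0, 9, 85, 96, 40, 17),
  Type3 = 0,
  stringsAsFactors = FALSE
)
my.df$Sum <- rowSums(my.df[3:5])
value_cols <- c("Type1", "Type2", "Type3", "Sum")

# Hypothetical helper: sum the value columns per group.
# reorder = FALSE keeps groups in order of first appearance.
margin_rows <- function(group, key_name) {
  s <- rowsum(my.df[value_cols], group, reorder = FALSE)
  out <- data.frame(key = rownames(s), s, row.names = NULL,
                    stringsAsFactors = FALSE)
  names(out)[1] <- key_name
  out
}

by_flag2 <- margin_rows(my.df$Flag2, "Flag2")  # totals across Flag1
by_flag2$Flag1 <- "(all)"
by_flag1 <- margin_rows(my.df$Flag1, "Flag1")  # totals across Flag2
by_flag1$Flag2 <- "(all)"
grand <- data.frame(Flag1 = "(all)", Flag2 = "(all)",
                    as.list(colSums(my.df[value_cols])),
                    stringsAsFactors = FALSE)

# rbind() on data frames matches columns by name.
with_margins <- rbind(my.df, by_flag2, by_flag1, grand)
rownames(with_margins) <- NULL

# Running total within each Flag1 group, detail rows only;
# the Flag2 == "(all)" rows keep their plain Sum.
detail <- with_margins$Flag2 != "(all)"
running <- with_margins$Sum
running[detail] <- ave(with_margins$Sum[detail],
                       with_margins$Flag1[detail], FUN = cumsum)
with_margins$RunningSum <- running
with_margins
```

The running total is kept in a separate column here so the plain row sums stay visible; overwriting Sum with it would reproduce the layout of the second table in the question.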