Multiple functions in aggregate

Time: 2015-09-14 09:10:34

Tags: r aggregate

Suppose I have the following data frame df1:

 Branch Loan_Amount TAT
      A         100 2.0
      A         120 4.0
      A         300 9.0
      B         150 1.5
      B         200 2.0

Can I use the aggregate function to get the following output as data frame df2?

 Branch Number_of_loans Loan_Amount Total_TAT
      A               3         520      15.0
      B               2         350       3.5

I know I can use nrow to compute Number_of_loans and then merge, but I am looking for a better way.
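For reference, the count-then-merge route mentioned above might look like the sketch below. It is only an illustration under assumptions: it counts rows with length() inside aggregate() rather than calling nrow directly, and the intermediate names sums and counts are hypothetical.

```r
df1 <- read.table(text = "Branch Loan_Amount TAT
A 100 2.0
A 120 4.0
A 300 9.0
B 150 1.5
B 200 2.0", header = TRUE)

# Per-branch sums of both numeric columns
sums <- aggregate(cbind(Loan_Amount, TAT) ~ Branch, data = df1, FUN = sum)

# Per-branch row counts (length() of any column gives the group size)
counts <- aggregate(TAT ~ Branch, data = df1, FUN = length)
names(counts)[2] <- "Number_of_loans"

# Join the two summaries on Branch and rename TAT to match df2
df2 <- merge(counts, sums, by = "Branch")
names(df2)[names(df2) == "TAT"] <- "Total_TAT"
df2
```

This takes two aggregations plus a join, which is the overhead the question is trying to avoid.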

3 Answers:

Answer 0 (score: 3)

With dplyr, you can do this:

library(dplyr)
group_by(d,Branch) %>% 
  summarize(Number_of_loans = n(),
            Loan_Amount = sum(Loan_Amount),
            TAT = sum(TAT))

Output

Source: local data frame [2 x 4]

  Branch Number_of_loans Loan_Amount   TAT
  (fctr)           (int)       (int) (dbl)
1      A               3         520  15.0
2      B               2         350   3.5

Data

d <- read.table(text="Branch Loan_Amount TAT
A         100 2.0
A         120 4.0
A         300 9.0
B         150 1.5
B         200 2.0",header=TRUE)

Answer 1 (score: 2)

Using data.table (df here stands for the question's data frame):

library(data.table)
setDT(df)[,list(Number_of_loans=.N, 
                Loan_Amount    =sum(Loan_Amount), 
                Total_TAT      =sum(TAT)), by=Branch]
#    Branch Number_of_loans Loan_Amount Total_TAT
# 1:      A               3         520      15.0
# 2:      B               2         350       3.5

Answer 2 (score: 0)

This is clunky and inefficient, but it works and it is fun (it uses aggregate()):

d <- read.table(text="Branch Loan_Amount TAT
A         100 2.0
A         120 4.0
A         300 9.0
B         150 1.5
B         200 2.0",header=TRUE)

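library(stringr)
# Pack the group size and the sum into one "count|sum" string per column
df = aggregate(.~Branch, data=d, FUN=function(x) paste0(length(x), '|',sum(x)))
# fixed('|') splits on a literal pipe; as a bare regex, '|' is alternation and matches the empty string
df_ = cbind(str_split_fixed(df$Loan_Amount, fixed('|'), 2), str_split_fixed(df$TAT, fixed('|'), 2)[,2])
df_ = apply(df_, 2, as.numeric)
colnames(df_) = c('Number_of_loans','Loan_Amount','Total_TAT')
cbind(df[,'Branch',drop=FALSE], df_)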

This produces the desired data:

  Branch Number_of_loans Loan_Amount Total_TAT
1      A               3         520      15.0
2      B               2         350       3.5
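For comparison, the same table can probably be obtained from a single aggregate() call without packing values into strings. The sketch below is an assumption-laden illustration rather than part of the answer above: it relies on adding a constant column of 1s inside cbind() so that summing it yields the per-branch count, and the name res is hypothetical.

```r
d <- read.table(text = "Branch Loan_Amount TAT
A 100 2.0
A 120 4.0
A 300 9.0
B 150 1.5
B 200 2.0", header = TRUE)

# The constant 1 is recycled to one value per row, so its sum is the group size;
# the names given inside cbind() become the output column names.
res <- aggregate(cbind(Number_of_loans = 1, Loan_Amount, Total_TAT = TAT) ~ Branch,
                 data = d, FUN = sum)
res
```

Because cbind() builds a numeric matrix, all three summary columns come back as doubles rather than integers.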