Given the following data frame df1:
Branch Loan_Amount TAT
A 100 2.0
A 120 4.0
A 300 9.0
B 150 1.5
B 200 2.0
I would like to use an aggregate function to obtain the following output as data frame df2:
Branch Number_of_loans Loan_Amount Total_TAT
A 3 520 15.0
B 2 350 3.5
I know I can use nrow to count Number_of_loans and then merge the results, but I am looking for a better approach.
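For reference, the count-and-merge route mentioned above might look roughly like the sketch below. This is only one reading of what the question describes: it uses `length` inside `aggregate()` as a stand-in for the `nrow` count, and the intermediate names `sums` and `counts` are made up for illustration.

```r
# Rebuild the example data from the question
df1 <- read.table(text = "Branch Loan_Amount TAT
A 100 2.0
A 120 4.0
A 300 9.0
B 150 1.5
B 200 2.0", header = TRUE)

# Step 1: per-branch sums of both numeric columns
sums <- aggregate(cbind(Loan_Amount, TAT) ~ Branch, data = df1, FUN = sum)

# Step 2: per-branch row counts (length of any one column within each group)
counts <- aggregate(Loan_Amount ~ Branch, data = df1, FUN = length)
names(counts)[2] <- "Number_of_loans"

# Step 3: join the two intermediate tables on Branch
df2 <- merge(counts, sums, by = "Branch")
names(df2)[names(df2) == "TAT"] <- "Total_TAT"
df2
```

It works, but it takes two aggregation passes plus a merge, which is the overhead the answers below avoid.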
Answer 0: (score: 3)
With dplyr, you can do this:
library(dplyr)
group_by(d,Branch) %>%
summarize(Number_of_loans = n(),
Loan_Amount = sum(Loan_Amount),
TAT = sum(TAT))
Output
Source: local data frame [2 x 4]
  Branch Number_of_loans Loan_Amount   TAT
  (fctr)           (int)       (int) (dbl)
1      A               3         520  15.0
2      B               2         350   3.5
Data
d <- read.table(text="Branch Loan_Amount TAT
A 100 2.0
A 120 4.0
A 300 9.0
B 150 1.5
B 200 2.0",head=TRUE)
Answer 1: (score: 2)
Using data.table:
library(data.table)
setDT(df1)[,list(Number_of_loans=.N,
Loan_Amount =sum(Loan_Amount),
Total_TAT =sum(TAT)), by=Branch]
# Branch Number_of_loans Loan_Amount Total_TAT
# 1: A 3 520 15.0
# 2: B 2 350 3.5
Answer 2: (score: 0)
This is clumsy and inefficient, but it works and it is fun (it uses aggregate()):
d <- read.table(text="Branch Loan_Amount TAT
A 100 2.0
A 120 4.0
A 300 9.0
B 150 1.5
B 200 2.0",head=TRUE)
library(stringr)
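# Pack the count and the sum into one string per cell, e.g. "3|520"
df = aggregate(.~Branch, data=d, FUN=function(x) paste0(length(x), '|', sum(x)))
# fixed('|') splits on a literal pipe; a bare '|' would be read as a regex alternation
df_ = cbind(str_split_fixed(df$Loan_Amount, fixed('|'), 2), str_split_fixed(df$TAT, fixed('|'), 2)[,2])
df_ = apply(df_, 2, as.numeric)
colnames(df_) = c('Number_of_loans','Loan_Amount','Total_TAT')
cbind(df[,'Branch',drop=FALSE], df_)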
This produces the desired data:
  Branch Number_of_loans Loan_Amount Total_TAT
1      A               3         520      15.0
2      B               2         350       3.5
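If the goal is to stay in base R without the string round trip, one possible alternative (not part of the answer above) is to add a helper column of ones and sum it along with the other columns, so that a single `aggregate()` call yields both the counts and the totals. The names `d1` and `res` are introduced here purely for illustration.

```r
d <- read.table(text = "Branch Loan_Amount TAT
A 100 2.0
A 120 4.0
A 300 9.0
B 150 1.5
B 200 2.0", header = TRUE)

# Helper column: every loan contributes 1, so its per-branch sum is the loan count
d1 <- transform(d, Number_of_loans = 1L)

# One pass: sum the helper column and both numeric columns by Branch
res <- aggregate(cbind(Number_of_loans, Loan_Amount, TAT) ~ Branch,
                 data = d1, FUN = sum)

# Rename to match the column name requested in the question
names(res)[names(res) == "TAT"] <- "Total_TAT"
res
```

Because `cbind()` builds a numeric matrix, the count comes back as a double rather than an integer; wrap it in `as.integer()` if the type matters.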