I have a data frame, Donations, which contains the following columns:
head(Donations)
Gender  $inYear04  $inYear05  $inYear06
M           19000      25000       7000
F           17000      15000      12000
F           10000      14000      10500
M           12000      19000       8000
M            2000      11000      18000
F           10500      16000      19500
This is the desired output:
Gender  Count  Percentage_Count  Total_Donation  Percentage_Donation  Mean_Donation
M          51               0.5          500000                 0.38           7000
F          49               0.5          800000                 0.61           9000
The output columns are derived from operations on the $inYear04, $inYear05 and $inYear06 columns.
Is aggregate() the best way to proceed?
PS: I am new to R programming.
Answer 0 (score: 0)
Try
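library(dplyr)
library(tidyr)

# df is the data frame from the question (Donations)
df %>%
  # reshape to long format: one row per Gender/year/donation value
  gather(key, value, -Gender) %>%
  group_by(Gender) %>%
  # inside summarise(), `.` is the whole long data frame, not the current group
  summarise(Count = n(), Percentage_Count = n() / nrow(.),
            Total_Donation = sum(value),
            Percentage_Donation = sum(value) / sum(.$value),
            Mean_Donation = mean(value))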
which gives:
# A tibble: 2 x 6
#   Gender Count Percentage_Count Total_Donation Percentage_Donation Mean_Donation
#   <fctr> <int>            <dbl>          <int>               <dbl>         <dbl>
# 1      F     9              0.5         124500           0.5071283      13833.33
# 2      M     9              0.5         121000           0.4928717      13444.44
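Note that Count above is 9 per group because, after reshaping, each person contributes three rows (one per year). If Count should be the number of people instead, and since the question asks about aggregate(), the following is a minimal base R sketch. It rests on some assumptions: the column names are simplified to inYear04, inYear05 and inYear06 (without the leading $), the data frame holds only the six sample rows from the question, the Total column is a helper introduced here, and Mean_Donation is taken to be the mean total per person, which the question does not specify.

```r
# Sample rows from the question; column names without the leading "$"
# are an assumption made for this illustration.
Donations <- data.frame(
  Gender   = c("M", "F", "F", "M", "M", "F"),
  inYear04 = c(19000, 17000, 10000, 12000, 2000, 10500),
  inYear05 = c(25000, 15000, 14000, 19000, 11000, 16000),
  inYear06 = c(7000, 12000, 10500, 8000, 18000, 19500)
)

# Helper column (hypothetical name): total donated by each person over the three years
Donations$Total <- rowSums(Donations[, c("inYear04", "inYear05", "inYear06")])

# One aggregate() call per statistic; both return the groups in the same sorted order
counts <- aggregate(Total ~ Gender, data = Donations, FUN = length)
totals <- aggregate(Total ~ Gender, data = Donations, FUN = sum)

# Combine the per-group statistics into the requested layout
result <- data.frame(
  Gender              = counts$Gender,
  Count               = counts$Total,
  Percentage_Count    = counts$Total / nrow(Donations),
  Total_Donation      = totals$Total,
  Percentage_Donation = totals$Total / sum(Donations$Total),
  Mean_Donation       = totals$Total / counts$Total  # assumed: mean per person
)
print(result)
```

So aggregate() can do it, but it computes one statistic per call, which is why the dplyr version above tends to be shorter when several summary columns are needed at once.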