Question

I have a data frame named Cust_Amount that looks like this:

Age    Amount_Spent
25       20
43       15
32       27
37       10
45       17
29       10

I want to split it into age groups of equal width and sum the amount spent within each group, like this:

Age_Group  Total_Amount
 20-30     30
 30-40     37
 40-50     32

Answer 1

We can use cut to bin 'Age' into groups and then take the sum of 'Amount_Spent' by that grouping variable (the data frame is called df1 here).

library(data.table)
setDT(df1)[,.(Total_Amount = sum(Amount_Spent)) , 
       by = .(Age_Group = cut(Age, breaks = c(20, 30, 40, 50)))]

Or with dplyr:

library(dplyr)
df1 %>%
    group_by(Age_Group = cut(Age, breaks = c(20, 30, 40, 50))) %>%
    summarise(Total_Amount = sum(Amount_Spent))
#     Age_Group Total_Amount
#      <fctr>        <int>
#1   (20,30]           30
#2   (30,40]           37
#3   (40,50]           32
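
One detail worth checking is how cut treats values that sit exactly on a break. By default the intervals are right-closed, so an age of exactly 20 would land in no group and become NA. The sketch below rebuilds the sample data from the question (under the name df1, as in the code above) and shows the right = FALSE variant using base R's tapply. The names edge, age_group and totals are only illustrative.

```r
# Sample data from the question, stored as df1 to match the code above
df1 <- data.frame(Age = c(25, 43, 32, 37, 45, 29),
                  Amount_Spent = c(20, 15, 27, 10, 17, 10))

# Default (right = TRUE): intervals are (20,30], (30,40], (40,50],
# so 20 falls outside every interval and is returned as NA
edge <- cut(c(20, 30), breaks = c(20, 30, 40, 50))

# right = FALSE gives left-closed intervals: [20,30), [30,40), [40,50)
age_group <- cut(df1$Age, breaks = c(20, 30, 40, 50), right = FALSE)

# Sum Amount_Spent within each age group
totals <- tapply(df1$Amount_Spent, age_group, sum)
totals
```

For the sample data the totals are the same either way, because no age lies exactly on a break. The choice only matters when ages such as 20, 30 or 40 can occur.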

Answer 2

Here is a base R solution using cut and aggregate, with setNames used to name the result columns:

mydf$Age_Group <- cut(mydf$Age, breaks = seq(20,50, by = 10))
with(mydf, setNames(aggregate(Amount_Spent ~ Age_Group, FUN = sum), 
                    c('Age_Group', 'Total_Spent')))

  Age_Group Total_Spent
1   (20,30]          30
2   (30,40]          37
3   (40,50]          32

We can go a step further with gsub to match your desired output (note that I am not a regex expert):

mydf$Age_Group <- 
    gsub(pattern = ',',
         x = gsub(pattern = ']', 
                  x = gsub(pattern = '(', x = mydf$Age_Group, replacement = '', fixed = TRUE),
                  replacement = '', fixed = TRUE),
         replacement = ' - ', fixed = TRUE)
with(mydf, setNames(aggregate(Amount_Spent ~ Age_Group, FUN = sum), 
                  c('Age_Group', 'Total_Spent')))

  Age_Group Total_Spent
1   20 - 30          30
2   30 - 40          37
3   40 - 50          32
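
As an alternative to cleaning the interval labels afterwards, cut accepts a labels argument, so the group names can be set when the groups are created. This is a minimal sketch, assuming the same sample data as in the question. The names brks, labs and result are illustrative, and the second column is named Total_Amount to follow the output requested in the question.

```r
# Sample data from the question, stored as mydf to match the code above
mydf <- data.frame(Age = c(25, 43, 32, 37, 45, 29),
                   Amount_Spent = c(20, 15, 27, 10, 17, 10))

brks <- seq(20, 50, by = 10)

# Pair each break with the next one to build "20-30", "30-40", "40-50"
labs <- paste(head(brks, -1), tail(brks, -1), sep = "-")

# Assign the labels directly instead of the default "(20,30]" style
mydf$Age_Group <- cut(mydf$Age, breaks = brks, labels = labs)

result <- aggregate(Amount_Spent ~ Age_Group, data = mydf, FUN = sum)
names(result) <- c("Age_Group", "Total_Amount")
result
```

Because the labels are derived from brks, changing the range or the step in seq updates the group names as well.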

How can I split the values in one column into equal ranges and sum the associated values of another column in R?
