I wanted to get the same output as the one on: https://www.r-bloggers.com/how-to-aggregate-data-in-r/
My output is:
  Group.1 Group.2 Name Role Shift Salary  Age
1    Cook  Dinner   NA   NA    NA   1800 25.0
2 Manager  Dinner   NA   NA    NA   2000 41.0
3  Server  Dinner   NA   NA    NA   1650 27.5
4    Cook   Lunch   NA   NA    NA   1200 24.0
5 Manager   Lunch   NA   NA    NA   2200 32.0
6  Server   Lunch   NA   NA    NA   1350 24.0
with columns containing NAs. Including "na.rm=TRUE" and "na.action=NULL" did not make any difference.
I also keep receiving warnings:
Warning messages: 1: In mean.default(X[[i]], ...) : argument is not numeric or logical: returning NA
How do I modify the aggregate() call so that it omits unnecessary columns and/or NA values, without having to resort to dplyr?
Thanks
agg = aggregate(data,
by = list(data$Role, data$Shift),
FUN = mean, na.rm=TRUE, na.action=NULL)
Answer 0 (score: 1)
Let's take a look at your aggregate call:
aggregate(data, by = list(data$Role, data$Shift), FUN = mean)
Here you are calculating the average of values across all columns of data, grouped by data$Role and data$Shift (which are your grouping variables).
The warning is pretty self-explanatory: it tells you that you are trying to calculate the mean of non-numeric entries. data$Name, data$Role and data$Shift are all non-numeric columns, so mean returns NA for each of them.
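To see where the NA columns come from, here is a minimal sketch on a small made-up data frame. The name staff and all of its values are hypothetical; only the column layout follows the question.

```r
# Hypothetical data with the same column layout as in the question
staff <- data.frame(
  Name   = c("Ann", "Bob", "Cy", "Dee"),
  Role   = c("Cook", "Cook", "Server", "Server"),
  Shift  = c("Lunch", "Lunch", "Dinner", "Dinner"),
  Salary = c(1000, 1400, 1600, 1700),
  Age    = c(22, 26, 30, 25),
  stringsAsFactors = FALSE
)

# mean() on a character column returns NA and emits the warning
# "argument is not numeric or logical: returning NA"
name_mean <- suppressWarnings(mean(staff$Name))

# List the columns that are numeric and can therefore be averaged
numeric_cols <- names(staff)[sapply(staff, is.numeric)]
```

Here name_mean is NA, and numeric_cols contains only "Salary" and "Age", which are the columns worth passing to aggregate.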
I assume you are after
aggregate(. ~ Role + Shift, data = data[, -1], FUN = mean)
#     Role  Shift Salary  Age
#1    Cook Dinner   1800 25.0
#2 Manager Dinner   2000 41.0
#3  Server Dinner   1650 27.5
#4    Cook  Lunch   1200 24.0
#5 Manager  Lunch   2200 32.0
#6  Server  Lunch   1350 24.0
The . (dot) here denotes all variables except the ones on the RHS of the ~ (tilde). Notice how we exclude data$Name by passing data[, -1] as the data argument to aggregate.
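As a variation on the formula call above, the Name column can also be dropped by name instead of by position, which keeps working if the column order changes. This is only a sketch on made-up data: staff and its values are hypothetical.

```r
# Hypothetical data with the same column layout as in the question
staff <- data.frame(
  Name   = c("Ann", "Bob", "Cy", "Dee"),
  Role   = c("Cook", "Cook", "Server", "Server"),
  Shift  = c("Lunch", "Lunch", "Dinner", "Dinner"),
  Salary = c(1000, 1400, 1600, 1700),
  Age    = c(22, 26, 30, 25),
  stringsAsFactors = FALSE
)

# Drop the non-numeric, non-grouping column by name rather than by index
staff_no_name <- staff[, names(staff) != "Name"]

# The dot expands to every remaining column that is not on the RHS,
# i.e. Salary and Age
agg_formula <- aggregate(. ~ Role + Shift, data = staff_no_name, FUN = mean)
```

The result has one row per Role/Shift combination and the columns Role, Shift, Salary and Age.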
Or, using the by syntax:
aggregate(data[, c("Salary", "Age")], by = list(data$Role, data$Shift), FUN = "mean")
Here the x argument refers to all columns whose values you want to aggregate according to the groups defined in by.
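With the by syntax, the grouping columns come out as Group.1 and Group.2 unless the list elements are named. A minimal sketch, again on made-up data (staff and its values are hypothetical):

```r
# Hypothetical data with the same column layout as in the question
staff <- data.frame(
  Name   = c("Ann", "Bob", "Cy", "Dee"),
  Role   = c("Cook", "Cook", "Server", "Server"),
  Shift  = c("Lunch", "Lunch", "Dinner", "Dinner"),
  Salary = c(1000, 1400, 1600, 1700),
  Age    = c(22, 26, 30, 25),
  stringsAsFactors = FALSE
)

# Naming the elements of `by` gives readable grouping column names
# instead of the default Group.1 / Group.2
agg_by <- aggregate(
  staff[, c("Salary", "Age")],
  by  = list(Role = staff$Role, Shift = staff$Shift),
  FUN = mean
)
```

The resulting columns are Role, Shift, Salary and Age, matching what the formula interface produces.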
In response to your comment, to aggregate only by Role:
aggregate(cbind(Salary, Age) ~ Role, data = data[, -1], FUN = mean)
#     Role Salary   Age
#1    Cook   1500 24.50
#2 Manager   2100 36.50
#3  Server   1500 25.75