I have a data frame (5,000 rows in total) that looks like this:
total_data
   Ind.Code  Stage Group Behaviour
1   Number1  first     a       INT
2   Number2  first     a       EAT
3   Number2 second     a       INT
4   Number1  third     a       STA
5   Number3  first     a       INT
6   Number1 second     b       EAT
7   Number1  first     b       PAS
8   Number3  third     b       EAT
9   Number2  third     b       INT
10  Number1  first     b       PAS
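Now I want to create something like this, with one row per combination and a count of how often it occurs (including zero counts):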
summarized_data
   Ind.Code  Stage Group Behaviour Count
1   Number1  first     a       INT     1
2   Number1  first     a       EAT     0
3   Number1  first     a       STA     0
4   Number1  first     a       PAS     0
5   Number1 second     a       INT     0
6   Number1 second     a       EAT     0
7   Number1 second     a       STA     0
8   Number1 second     a       PAS     0
9   Number1  third     a       INT     0
10  Number1  third     a       EAT     0
11  Number1  third     a       STA     1
12  Number1  third     a       PAS     0
And so on for the other Ind.Code values and groups.
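To pin down what this target means: it is a full cross-tabulation of the four columns, so combinations that never occur still get a row with Count 0. The following is a minimal sketch on the ten sample rows only, assuming base R's `table()` and `as.data.frame()` are acceptable; it is one possible approach rather than a definitive one, and the row order differs from the listing above because factor levels sort alphabetically.

```r
# The ten sample rows shown above (the real data has 5,000 rows)
total_data <- data.frame(
  Ind.Code  = c("Number1", "Number2", "Number2", "Number1", "Number3",
                "Number1", "Number1", "Number3", "Number2", "Number1"),
  Stage     = c("first", "first", "second", "third", "first",
                "second", "first", "third", "third", "first"),
  Group     = c("a", "a", "a", "a", "a", "b", "b", "b", "b", "b"),
  Behaviour = c("INT", "EAT", "INT", "STA", "INT",
                "EAT", "PAS", "EAT", "INT", "PAS"),
  stringsAsFactors = TRUE
)

# table() cross-tabulates every column and keeps empty cells as 0;
# as.data.frame() turns the 4-way table into long format
summarized_data <- as.data.frame(table(total_data), responseName = "Count")

# Sort so that Behaviour varies fastest within each Ind.Code / Group / Stage
summarized_data <- summarized_data[
  with(summarized_data, order(Ind.Code, Group, Stage, Behaviour)), ]
rownames(summarized_data) <- NULL

head(summarized_data, 12)
```

On the sample rows this yields one row for each of the 3 x 3 x 2 x 4 combinations, whether or not the combination was observed.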
However, I can't find the best way to do this. I have already tried the aggregate function:
aggregate(as.integer(total_data$Behaviour),
          by = list(total_data$Ind.Code, total_data$Stage,
                    total_data$Group, total_data$Behaviour),
          FUN = sum)
However, this only seems to work for some data points; for others, the value of "Count" appears to be doubled.
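A possible explanation for the doubling, assuming Behaviour is stored as a factor (an assumption, since the column type is not shown): `as.integer()` on a factor returns the internal level codes rather than 1 for each row, so summing them multiplies each count by the code of its level. The sketch below uses only the Behaviour values from the ten sample rows.

```r
# Behaviour values from the ten sample rows, stored as a factor (assumed)
behaviour <- factor(c("INT", "EAT", "INT", "STA", "INT",
                      "EAT", "PAS", "EAT", "INT", "PAS"))

levels(behaviour)      # levels are sorted alphabetically
as.integer(behaviour)  # level codes, not a count of 1 per row

# Summing the level codes per behaviour (what the aggregate call effectively does)
codes_sum <- tapply(as.integer(behaviour), behaviour, sum)

# Summing a constant 1 per row gives the actual number of occurrences
counts <- tapply(rep(1L, length(behaviour)), behaviour, sum)

rbind(codes_sum, counts)
```

Under that assumption, the level with code 1 happens to give the right answer, the level with code 2 looks doubled, and higher codes are inflated further. Aggregating a constant 1 instead would count correctly, but `aggregate()` only returns combinations that occur in the data, so it would not produce the zero rows on its own.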
Can someone help me find the correct code?
Thanks a lot.