I have a data.frame like this:
P Stat V Points
1 Goals 2 10
1 Assists 1 3
2 Goals 1 5
2 Assists 1 3
I would like to convert it into something like this:
P Goals Assists Points
1 2 1 13
2 1 1 8
Currently I am using dcast, as follows:
dcast(stats, P ~ Stat, value.var = "V")
This works as long as "Points" is left out. When I add Points, it starts using _1, _2, etc.
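Any help is appreciated. This is not a school project; I am just a curious consultant trying to refresh my statistics skills on problems that interest me!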
Answer 0 (score: 1)
We can do the dcast and then add the "Points" column:
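library(data.table)
# Note: sum(d1$Points) is the total over all rows of d1, so this yields the
# per-player total only when d1 holds a single P, as in the output below.
dcast(setDT(d1), P~Stat, value.var = "V")[, Points := sum(d1$Points)][]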
# P Assists Goals Points
#1: 1 1 2 13
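When d1 holds more than one player, the total has to be computed per P. A minimal sketch of one way to do that, assuming d1 contains all four rows from the question (the data below is retyped here for illustration, and the names `res` and `totals` are hypothetical), is to aggregate by P and attach the result with an update join:

```r
library(data.table)

# Assumed copy of the question's data, including both players
d1 <- data.frame(
  P      = c(1L, 1L, 2L, 2L),
  Stat   = c("Goals", "Assists", "Goals", "Assists"),
  V      = c(2L, 1L, 1L, 1L),
  Points = c(10L, 3L, 5L, 3L)
)
setDT(d1)

# Reshape V to wide format, one column per Stat
res <- dcast(d1, P ~ Stat, value.var = "V")

# Total points per player
totals <- d1[, .(Points = sum(Points)), by = P]

# Update join: add the Points column to res by reference, matched on P
res[totals, Points := i.Points, on = "P"]
res
```

The update join modifies `res` in place, so no intermediate copy of the wide table is created.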
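Answer 1 (score: 1)
There are (at least) two possibilities to achieve the desired result.
Recent versions of data.table allow specifying multiple value.var columns as a parameter to dcast():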
library(data.table) # version 1.10.4 used
dcast(DT, P ~ Stat, value.var = list("V", "Points"))
# P V_Assists V_Goals Points_Assists Points_Goals
#1: 1 1 2 3 10
#2: 2 1 1 3 5
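If only a single Points column is needed, the points have to be added up and the unnecessary columns removed. With chaining, this can be done in one statement, but it is not very concise.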
dcast(DT, P ~ Stat, value.var = list("V", "Points"))[
, Points := Points_Assists + Points_Goals][
, c("Points_Assists", "Points_Goals") := NULL][]
# P V_Assists V_Goals Points
#1: 1 1 2 13
#2: 2 1 1 8
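The chained statement above names each Points_* column explicitly. As a sketch of a variant that does not hard-code them, the columns can be picked up by pattern and summed row-wise. The data is retyped from the question, the names `wide` and `points_cols` are hypothetical, and stripping the V_ prefix is an extra step assumed here to match the layout the question asks for:

```r
library(data.table)

# Assumed copy of the question's data
DT <- data.table(
  P      = c(1L, 1L, 2L, 2L),
  Stat   = c("Goals", "Assists", "Goals", "Assists"),
  V      = c(2L, 1L, 1L, 1L),
  Points = c(10L, 3L, 5L, 3L)
)

wide <- dcast(DT, P ~ Stat, value.var = c("V", "Points"))

# Collect every Points_* column instead of naming each one
points_cols <- grep("^Points_", names(wide), value = TRUE)

# Row-wise total of the collected columns, then drop them
wide[, Points := rowSums(.SD, na.rm = TRUE), .SDcols = points_cols]
wide[, (points_cols) := NULL]

# Drop the "V_" prefix so the columns are named after the Stat values
setnames(wide, sub("^V_", "", names(wide)))
wide
```

This keeps working if further Stat values are added, at the cost of relying on the Points_ naming that dcast() generates.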
Alternatively, the dcast of V and the aggregation of Points can be done separately, with the results joined afterwards:
# dcast
temp1 <- dcast(DT, P ~ Stat, value.var = "V")
temp1
# P Assists Goals
#1: 1 1 2
#2: 2 1 1
# sum points by P
temp2 <- DT[, .(Points = sum(Points)), by = P]
temp2
# P Points
#1: 1 13
#2: 2 8
Now the two results need to be joined:
temp1[temp2, on = "P"]
# P Assists Goals Points
#1: 1 1 2 13
#2: 2 1 1 8
Or combined into one statement:
dcast(DT, P ~ Stat, value.var = "V")[DT[, .(Points = sum(Points)), by = P], on = "P"]
The code looks more straightforward and concise than the first variant.
Data:
library(data.table)
DT <- fread(
"P Stat V Points
1 Goals 2 10
1 Assists 1 3
2 Goals 1 5
2 Assists 1 3")
Note that fread() returns a data.table object by default. If DT is still a data.frame, it needs to be coerced to a data.table with
setDT(DT)