I'm using R and want to sum certain values in a column, but only for rows that meet a condition. So if I have the data frame data below:
Team MP Win
ATL 14 .4
ATL 25 .4
ATL 14 .4
BOS 14 .55
BOS 20 .55
BOS 9 .55
How do I store the sum of MP for ATL (14 + 25 + 14 = 53) and for BOS (14 + 20 + 9 = 43)?
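Edit: What if I also want to add a new variable that multiplies Win by MP / sums (where sums is the sum of MP for each team)? So for the ATL rows I want the values .4 * 14/53 and .4 * 25/53, and for BOS I want .55 * 14/43, .55 * 20/43, .55 * 9/43.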
Answer 0 (score: 4)
I think this produces what you want:
Edit
Based on akrun's excellent answer, here is a more compact solution:
dat$cumsums <- ave(dat$MP, dat$Team, FUN=sum)
dat$newvar <- with(dat, Win * (MP/cumsums))
Previous solution
cumsums <- by(data = dat$MP, INDICES = dat$Team, FUN = sum)
cumsums.df <- data.frame(Team = names(cumsums), cumsums = as.numeric(cumsums))
dat <- merge(x=dat, y=cumsums.df, by = "Team")
dat$newvar <- with(dat, Win * (MP/cumsums))
Result
dat
Team MP Win cumsums newvar
1 ATL 14 0.40 53 0.1056604
2 ATL 25 0.40 53 0.1886792
3 ATL 14 0.40 53 0.1056604
4 BOS 14 0.55 43 0.1790698
5 BOS 20 0.55 43 0.2558140
6 BOS 9 0.55 43 0.1151163
Data
dat <- read.csv(text="Team,MP,Win
ATL,14,.4
ATL,25,.4
ATL,14,.4
BOS,14,.55
BOS,20,.55
BOS,9,.55")
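As a quick way to check the result above, here is a minimal base R sketch. It assumes the same six-row example data; the names team_totals and check are hypothetical and only used for illustration. It first stores one total per team (the 53 and 43 from the question), then verifies that newvar adds back up to each team's Win, which should hold because the MP shares within a team sum to 1.

```r
# Rebuild the example data from the question
dat <- read.csv(text = "Team,MP,Win
ATL,14,.4
ATL,25,.4
ATL,14,.4
BOS,14,.55
BOS,20,.55
BOS,9,.55")

# One total per team, as a named vector (hypothetical name: team_totals)
team_totals <- tapply(dat$MP, dat$Team, sum)

# The same totals, repeated on every row of the matching team
dat$cumsums <- ave(dat$MP, dat$Team, FUN = sum)
dat$newvar  <- with(dat, Win * MP / cumsums)

# Sanity check: within a team the MP shares add up to 1,
# so newvar should sum back to that team's Win value
check <- tapply(dat$newvar, dat$Team, sum)
```

tapply gives one value per team, while ave returns a vector as long as the input, which is why ave can be assigned straight into a new column.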
Answer 1 (score: 2)
We can do this with base R, dplyr, or data.table.
1. base R
Create the columns using within and ave:
within(dat, {cumsums <- ave(MP, Team, FUN=sum)
newvar <- Win*(MP/cumsums)})[c(1:3, 5:4)]
# Team MP Win cumsums newvar
#1 ATL 14 0.40 53 0.1056604
#2 ATL 25 0.40 53 0.1886792
#3 ATL 14 0.40 53 0.1056604
#4 BOS 14 0.55 43 0.1790698
#5 BOS 20 0.55 43 0.2558140
#6 BOS 9 0.55 43 0.1151163
2. data.table
If we need the variables 'cumsums' and 'newvar', convert the 'data.frame' to a 'data.table' (setDT(dat)), get the sum of the 'MP' column grouped by 'Team', and use it to create both columns.
library(data.table)
setDT(dat)[, c('cumsums', 'newvar') := {tmp=sum(MP)
list(tmp, tmp1 = Win*MP/tmp)}, by = Team][]
# Team MP Win cumsums newvar
#1: ATL 14 0.40 53 0.1056604
#2: ATL 25 0.40 53 0.1886792
#3: ATL 14 0.40 53 0.1056604
#4: BOS 14 0.55 43 0.1790698
#5: BOS 20 0.55 43 0.2558140
#6: BOS 9 0.55 43 0.1151163
3. dplyr
After grouping by 'Team', use mutate to create the columns 'cumsums' and 'newvar'.
library(dplyr)
dat %>%
group_by(Team) %>%
mutate(cumsums= sum(MP), newvar= Win*MP/cumsums)
# Team MP Win cumsums newvar
#1 ATL 14 0.40 53 0.1056604
#2 ATL 25 0.40 53 0.1886792
#3 ATL 14 0.40 53 0.1056604
#4 BOS 14 0.55 43 0.1790698
#5 BOS 20 0.55 43 0.2558140
#6 BOS 9 0.55 43 0.1151163
Answer 2 (score: 0)
aggregate will do exactly what you want:
> data <- merge(data, aggregate(MP~Team, data = data, sum), by = 'Team', all.x = TRUE)
> names(data) <- c('Team', 'MP', 'Win', 'SumByTeam')
> data$Value <- data$MP /data$SumByTeam * data$Win
> aggregate(Value ~ Team + MP, data = data, mean)
Team MP Value
1 BOS 9 0.1151163
2 ATL 14 0.1056604
3 BOS 14 0.1790698
4 BOS 20 0.2558140
5 ATL 25 0.1886792
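Note that the final aggregate call groups by Team and MP, so the two ATL rows with MP 14 collapse into a single row and the output has five rows rather than six. If one value per original row is wanted, a minimal sketch (assuming the same example data; totals is a hypothetical helper name) is to rename the summed column before merging and skip the last aggregate step:

```r
# Rebuild the example data from the question
data <- read.csv(text = "Team,MP,Win
ATL,14,.4
ATL,25,.4
ATL,14,.4
BOS,14,.55
BOS,20,.55
BOS,9,.55")

# Per-team totals; rename the summed column so it does not clash with MP
totals <- aggregate(MP ~ Team, data = data, FUN = sum)
names(totals)[names(totals) == "MP"] <- "SumByTeam"

# Attach each team's total to every one of its rows
data <- merge(data, totals, by = "Team", all.x = TRUE)

# Weighted value per row; all six rows are kept
data$Value <- data$MP / data$SumByTeam * data$Win
```

Renaming before the merge also avoids the MP.x / MP.y suffixes that merge adds when both inputs share a non-key column name.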