My data looks like this:
Participant Round Total
1 100 5
1 101 8
1 102 12
1 200 42
2 100 14
2 101 71
40 100 32
40 101 27
40 200 18
I would like to get a table with the Total of the last Round (200) minus the Total of the first Round (100) for each participant.
For example, for Participant 1 it is:
42 - 5 = 37
Answer 0 (score: 12)
Base R
aggregate(Total ~ Participant, df[df$Round %in% c(100, 200), ], diff)
#   Participant Total
# 1           1    37
# 2           2
# 3          40   -14
Or similarly with subset
aggregate(Total ~ Participant, df, subset = Round %in% c(100, 200), diff)
Or data.table
library(data.table)
setDT(df)[Round %in% c(100, 200), diff(Total), by = Participant]
#    Participant  V1
# 1:           1  37
# 2:          40 -14
Or using a binary join
setkey(setDT(df), Round)
df[.(c(100, 200)), diff(Total), by = Participant]
#    Participant  V1
# 1:           1  37
# 2:          40 -14
Or dplyr
library(dplyr)
df %>%
  group_by(Participant) %>%
  filter(Round %in% c(100, 200)) %>%
  summarise(Total = diff(Total))
# Source: local data table [2 x 2]
#
#   Participant Total
# 1           1    37
# 2          40   -14
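A note on the diff() approaches above: they rely on the round-100 row coming before the round-200 row within each participant, and diff() returns a zero-length vector when only one of the two rounds is present. That is why participant 2 shows an empty Total in the aggregate output and is dropped by data.table. A minimal base R sketch that looks the values up by Round and returns NA for incomplete participants could look like this (round_diff is a hypothetical helper name, not part of any package):

```r
# Data as given in the question (participant 2 has no round 200)
df <- read.table(header = TRUE, text = "
Participant Round Total
1 100 5
1 101 8
1 102 12
1 200 42
2 100 14
2 101 71
40 100 32
40 101 27
40 200 18")

diff(c(5, 42))  # 37
diff(14)        # numeric(0): nothing to subtract from

# Hypothetical helper: subtract by Round value, not by row position,
# and return NA when either round is missing for a participant
round_diff <- function(d, first = 100, last = 200) {
  a <- d$Total[d$Round == first]
  b <- d$Total[d$Round == last]
  if (length(a) == 1 && length(b) == 1) b - a else NA_real_
}

# One value per participant, named by participant id
res <- sapply(split(df, df$Participant), round_diff)
res
#   1   2  40
#  37  NA -14
```

Whether NA or a dropped row is the better outcome for incomplete participants depends on what you do with the table afterwards.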
Answer 1 (score: 2)
Try this:
df <- read.table(header = TRUE, text = "
Participant Round Total
1 100 5
1 101 8
1 102 12
1 200 42
2 100 14
2 101 71
2 200 80
40 100 32
40 101 27
40 200 18")
library(data.table)
setDT(df)[ , .(Total = Total[Round == 200] - Total[Round == 100]), by = Participant]
Answer 2 (score: 1)
You can try this:
library(dplyr)
group_by(df, Participant) %>%
  filter(row_number() == 1 | row_number() == max(row_number())) %>%
  mutate(df = diff(Total)) %>%
  select(Participant, df) %>%
  unique()
# Source: local data frame [3 x 2]
# Groups: Participant
#   Participant  df
# 1           1  37
# 2           2  57
# 3          40 -14
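Note that this picks the first and last row of each participant rather than rounds 100 and 200 specifically, so participant 2 (who has no round 200 in the question's data) gets 71 - 14 = 57. A rough base R equivalent, assuming the rows are already sorted by Round within each participant, might be (first_last is just an illustrative name):

```r
# Data as given in the question (participant 2 has no round 200)
df <- read.table(header = TRUE, text = "
Participant Round Total
1 100 5
1 101 8
1 102 12
1 200 42
2 100 14
2 101 71
40 100 32
40 101 27
40 200 18")

# Last Total minus first Total within each participant,
# relying on row order rather than on the Round values
first_last <- sapply(split(df$Total, df$Participant),
                     function(x) x[length(x)] - x[1])
first_last
#   1   2  40
#  37  57 -14
```

If the last available round is what you want when round 200 is missing, this behaviour is useful; if not, filter on Round explicitly.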
Answer 3 (score: 1)
Everyone likes a bit of sqldf, so if you are not required to use apply, give this a try:
First, some test data:
df <- read.table(header = TRUE, text = "
Participant Round Total
1 100 5
1 101 8
1 102 12
1 200 42
2 100 14
2 101 71
2 200 80
40 100 32
40 101 27
40 200 18")
Next, use SQL to create two columns, one for round 100 and one for round 200, and subtract them:
library(sqldf)
rolled <- sqldf("
SELECT tab_a.Participant AS Participant
,tab_b.Total_200 - tab_a.Total_100 AS Difference
FROM (
SELECT Participant
,Total AS Total_100
FROM df
WHERE Round = 100
) tab_a
INNER JOIN (
SELECT Participant
,Total AS Total_200
FROM df
WHERE Round = 200
) tab_b ON (tab_a.Participant = tab_b.Participant)
")