我有很多运动员在比赛期间的位置数据。比赛的每个季度最多30分钟。我的数据的一个例子是:
> df
StartValue Athlete Quarter Position
1 0.00 Paul Q1 Bench
2 5.35 Paul Q1 Defender
3 19.26 Paul Q1 Bench
4 23.32 Paul Q1 Defender
5 0.00 Paul Q2 Bench
6 9.08 Paul Q2 Defender
7 13.11 Paul Q2 Defender
8 0.00 Paul Q3 Defender
9 7.36 Paul Q3 Defender
10 2.51 Paul Q3 Bench
11 6.44 Paul Q4 Bench
12 22.47 Paul Q4 Bench
13 0.00 Paul Q4 Defender
14 24.38 Paul Q4 Defender
15 11.36 Paul Q4 Defender
我现在希望创建一个新列df$EndValue
,它会获取下面行的StartValue
并将其放在同一列中。当一个季度的最后一个条目出现时,必须在df$EndValue
中放置30。例如,前几行是:
> df
StartValue Athlete Quarter Position EndValue
1 0.00 Paul Q1 Bench 5.35
2 5.35 Paul Q1 Defender 19.26
3 19.26 Paul Q1 Bench 23.32
4 23.32 Paul Q1 Defender 30.00
5 0.00 Paul Q2 Bench 9.08
我对data.frame的预期输出是:
Output <- data.frame(StartValue=c(0, 5.35, 19.26, 23.32,
0.00, 9.08, 13.11, 0,
2.51, 7.36, 0.0, 6.44,
11.36, 22.47, 24.38),
EndValue=c(5.35, 19.26, 23.32, 30,
9.08, 13.11, 30, 2.51,
7.36, 30, 6.44, 11.36,
22.47, 24.38, 30),
Athlete = c('Paul', 'Paul', 'Paul', 'Paul',
'Paul', 'Paul', 'Paul','Paul',
'Paul', 'Paul', 'Paul','Paul',
'Paul', 'Paul', 'Paul'),
Quarter = c('Q1', 'Q1', 'Q1', 'Q1',
'Q2', 'Q2', 'Q2', 'Q3',
'Q3', 'Q3', 'Q4', 'Q4',
'Q4', 'Q4', 'Q4'),
Position = c('Bench','Defender','Bench','Defender',
'Bench','Defender','Defender','Defender',
'Defender','Bench','Bench','Bench',
'Defender', 'Defender', 'Defender'))
我在这30分钟的时间里有很多运动员的数据,那么我该如何快速添加这个新专栏呢?
谢谢。
答案 0 :(得分:2)
setDT
将数据帧转换为数据表。按Quarter
分组,并将最后一个值指定为30并生成EndValue
列。
library('data.table')
修改强>
在您的评论中,您要求使用唯一值更改每个季度的结束值。首先将StartValue
分配给EndValue
,然后找到每个季度中最后一个值的行索引。在下一步中,使用EndValue
31 for Q1, 32 for Q2, 33 for Q3 and 34 for Q4.
我创造了两个球员 - 保罗和鲍勃。除了他们的名字,他们都有相同的数据。
# sample data
setDT( df ) # convert data frame to data table by reference
df1 <- copy(df) # replicate data by copying df
df[, Athlete := 'Bob'] # asssign Athlete with Bob player
df <- rbindlist(l = list( df1, df) ) # combine df1 and df
# sort StartValue by player and quarter
df <- df[order(StartValue), .SD, by = .( Athlete, Quarter ) ]
# assign start to endvalue and with unique number per player per quarter
df[, EndValue := StartValue ] # Assign StartValue to EndValue
# remove 1st, shift values up and assign NA to last
df[, EndValue := c( EndValue[-1], NA ), by = .(Athlete, Quarter )]
df[ i = df[, .I[.N], by = .(Quarter, Athlete)][, V1],
j = EndValue := rep( c(31,32,33,34),
length( df[, unique(Athlete) ] ) ) ]
df
# Athlete Quarter StartValue Position EndValue
# 1: Paul Q1 0.00 Bench 5.35
# 2: Paul Q1 5.35 Defender 19.26
# 3: Paul Q1 19.26 Bench 23.32
# 4: Paul Q1 23.32 Defender 31.00
# 5: Paul Q2 0.00 Bench 9.08
# 6: Paul Q2 9.08 Defender 13.11
# 7: Paul Q2 13.11 Defender 32.00
# 8: Paul Q3 0.00 Defender 2.51
# 9: Paul Q3 2.51 Bench 7.36
# 10: Paul Q3 7.36 Defender 33.00
# 11: Paul Q4 0.00 Defender 6.44
# 12: Paul Q4 6.44 Bench 11.36
# 13: Paul Q4 11.36 Defender 22.47
# 14: Paul Q4 22.47 Bench 24.38
# 15: Paul Q4 24.38 Defender 34.00
# 16: Bob Q1 0.00 Bench 5.35
# 17: Bob Q1 5.35 Defender 19.26
# 18: Bob Q1 19.26 Bench 23.32
# 19: Bob Q1 23.32 Defender 31.00
# 20: Bob Q2 0.00 Bench 9.08
# 21: Bob Q2 9.08 Defender 13.11
# 22: Bob Q2 13.11 Defender 32.00
# 23: Bob Q3 0.00 Defender 2.51
# 24: Bob Q3 2.51 Bench 7.36
# 25: Bob Q3 7.36 Defender 33.00
# 26: Bob Q4 0.00 Defender 6.44
# 27: Bob Q4 6.44 Bench 11.36
# 28: Bob Q4 11.36 Defender 22.47
# 29: Bob Q4 22.47 Bench 24.38
# 30: Bob Q4 24.38 Defender 34.00
# Athlete Quarter StartValue Position EndValue
答案 1 :(得分:1)
以下是使用int count=0;
for (int i = 0; i < 100; i++)
{
Thread.Sleep(1000);
count++;
ProgressReportAngular(count);
}
:
dplyr
如果它变得更复杂,例如多个季度长度不同的游戏,我会创建一个新的library(dplyr)
quarter_lengths <- c(Q1 = 31, Q2 = 32, Q3 = 30, Q4 = 33)
df %>%
group_by(Athlete, Quarter) %>%
mutate(EndValue = c(StartValue[-1], quarter_lengths[Quarter[1]]))
,其长度为data.frame
,inner_join
。