year event athlete time
2000 100m Ato Boldon 9.95
2000 100m Brian Lewis 10.02
2000 100m Coby Miller 9.98
2000 100m Francis Obikwelu 9.97
2000 100m Jon Drummond 9.96
2000 100m Maurice Greene 9.86
2000 100m Michael Marsh 10.01
2000 100m Obadele Thompson 9.97
2000 100m Tony McCall 10.06
2001 100m Ato Boldon 9.88
2001 100m Aziz Zakari 10.04
2001 100m Bernard Williams 9.96
2001 100m Dwain Chambers 10
2001 100m Josh Norman 10.17
2001 100m Kim Collins 10.04
2001 100m Leonard Scott 10.05
2001 100m Mark Lewis-Francis 10.12
2001 100m Maurice Greene 9.9
2002 100m Bernard Williams 9.99
2002 100m Chris Williams 10.13
2002 100m Francis Obikwelu 10.01
2002 100m J.J. Johnson 9.95
2002 100m Kim Collins 9.98
2002 100m Marc Burns 10.18
2002 100m Mark Lewis-Francis 10.04
2002 100m Maurice Greene 9.89
2002 100m Shingo Suetsugu 10.05
2002 100m Taiwo Ajibade 10.18
2003 100m Bernard Williams 10.04
2003 100m Deji Aliu 9.95
2003 100m Dwain Chambers 10.06
2003 100m Hristóforos Hoídis 10.16
2003 100m J.J. Johnson 10.05
2003 100m John Capel 9.97
2003 100m Justin Gatlin 9.97
2003 100m Kim Collins 9.99
2003 100m Maurice Greene 9.94
2004 100m Asafa Powell 9.87
2004 100m Ato Boldon 10.09
2004 100m Christie van Wyk 10.09
2004 100m Darrel Brown 10.11
2004 100m Francis Obikwelu 10.02
2004 100m Justin Gatlin 9.92
2004 100m Maurice Greene 9.91
2004 100m Mickey Grimes 10.12
2004 100m Shawn Crawford 9.88
2005 100m Asafa Powell 9.77
2005 100m Aziz Zakari 9.99
2005 100m Dwight Thomas 10
2005 100m Francis Obikwelu 10.04
2005 100m Justin Gatlin 9.89
2005 100m Leonard Scott 9.94
2005 100m Marc Burns 9.96
2005 100m Maurice Greene 10.01
2005 100m Shawn Crawford 9.99
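I am working with a dataset in R that has four columns: year, event, athlete and score (the `time` column in the sample above). Each row is an observation of an athlete's score within a given event and year.
What I want to do is create a new column that shows every athlete's all-time best score, where the best score is their lowest score.
In Excel I would create a MINIFS formula that checks whether a given year's score is lower than the scores from the previous years. If so, it becomes the athlete's all-time best score; if not, the formula returns whatever their previous best score was.
Apologies if this has already been asked and answered, but any help would be much appreciated.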
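Answer 0 (score: 0)
The Excel MINIFS function returns the smallest numeric value in a range of values that meets one or more criteria. Here is an example of a simple replication in R: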
# 1. Libraries
library(dplyr)
# 2. Data set
df <- data.frame(
  year    = c(2000, 2000, 2000),
  athlete = c("Ato Boldon", "Brian Lewis", "Coby Miller"),
  event   = c("100m", "100m", "200m"),
  score   = c(9.95, 10.02, 9.98)
)
# 3. Replicate Excel 'MINIFS' function
# 3.1. One solution
df %>%
  group_by(event) %>%
  filter(score == min(score)) %>%
  ungroup()
# 3.2. Another solution
df %>%
  group_by(event) %>%
  mutate(min_score = ifelse(event == "200m", min(score), score)) %>%
  ungroup()
# 3.3. All-time best score per 'athlete'
df_athlete_all_time <- df %>%
  group_by(athlete) %>%
  summarise(min_score_all_time = min(score)) %>%
  ungroup()
# 3.4. Merge with original data
df_merge <- left_join(df, df_athlete_all_time, by = "athlete")
# 3.5. Find the 'year' in which the best score took place
df_merge %>%
  filter(score == min_score_all_time)
# 3.6. Compare it to all the athlete's previous years' scores and print out the smaller of the two
# Homework :)
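The step left as homework, a best-so-far value that only looks at the current and earlier years, can be sketched with a cumulative minimum. This is only a minimal sketch: the data frame `df_years` and the column `best_so_far` are hypothetical names introduced for illustration, and the rows are a handful copied from the question's data so that each athlete has more than one year.

```r
library(dplyr)

# Hypothetical multi-year sample (rows copied from the question's data),
# using the same column names as 'df' in this answer
df_years <- data.frame(
  year    = c(2000, 2001, 2004, 2000, 2001, 2002),
  athlete = c(rep("Ato Boldon", 3), rep("Maurice Greene", 3)),
  event   = "100m",
  score   = c(9.95, 9.88, 10.09, 9.86, 9.90, 9.89)
)

df_running <- df_years %>%
  arrange(athlete, event, year) %>%      # cummin() depends on row order
  group_by(athlete, event) %>%           # restart the running minimum per athlete and event
  mutate(best_so_far = cummin(score)) %>%
  ungroup()

df_running
```

Sorting by year before grouping matters here, because `cummin()` simply walks down the rows in the order it receives them.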
Answer 1 (score: 0)
# example data
df = read.table(text = "
year event athlete time
2000 100m AtoBoldon 9.95
2001 100m AtoBoldon 10.02
2000 100m CobyMiller 9.98
2003 100m AtoBoldon 9.97
2001 100m CobyMiller 9.96
2003 100m CobyMiller 9.86
", header = TRUE)
library(dplyr)
df %>%
  group_by(athlete, event) %>%                        # for each athlete and event
  mutate(best_time = min(time),                       # get minimum time
         year_best_time = year[which.min(time)]) %>%  # year of the (first) minimum time; safe when times tie
  ungroup()
# # A tibble: 6 x 6
# year event athlete time best_time year_best_time
# <int> <fct> <fct> <dbl> <dbl> <int>
# 1 2000 100m AtoBoldon 9.95 9.95 2000
# 2 2001 100m AtoBoldon 10.0 9.95 2000
# 3 2000 100m CobyMiller 9.98 9.86 2003
# 4 2003 100m AtoBoldon 9.97 9.95 2000
# 5 2001 100m CobyMiller 9.96 9.86 2003
# 6 2003 100m CobyMiller 9.86 9.86 2003
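The output above attaches each athlete's overall best to every row, including rows from years before that best was run. If the goal is instead the best time up to and including each row's year, as the question describes, one possible variant is sketched below. It is an assumption-laden sketch rather than part of the original answer: `df_example`, `best_so_far` and `year_best_so_far` are hypothetical names, and the data is the same example data as above.

```r
library(dplyr)

# Same example data as in this answer, under a hypothetical name
df_example <- read.table(text = "
year event athlete time
2000 100m AtoBoldon 9.95
2001 100m AtoBoldon 10.02
2000 100m CobyMiller 9.98
2003 100m AtoBoldon 9.97
2001 100m CobyMiller 9.96
2003 100m CobyMiller 9.86
", header = TRUE)

res <- df_example %>%
  arrange(athlete, event, year) %>%                # running values depend on row order
  group_by(athlete, event) %>%
  mutate(best_so_far = cummin(time),               # lowest time up to this year
         # first year in which that running best was recorded
         year_best_so_far = year[match(best_so_far, time)]) %>%
  ungroup()

res
```

`match()` returns the first position where the running best occurs within the group, so a later tie does not change the recorded year.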