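For my assignment, I have a dataset of Olympic participants. I need to summarise the mean BMI for each Olympic sport. I created the BMI column earlier and have the following code: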
olympics <- mutate(olympics, BMI = (Weight/(Height*Height)*10000))
answer6 <- olympics %>%
group_by(Sport, BMI, Sport) %>%
summarise()
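This leaves me with a table of about 13,000 rows. It needs to be a summary table with one row per sport, containing the sport and the mean BMI for that sport.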
After that, I need to store the 5 highest mean BMIs, in descending order, in a new object. The final result should look like this:
Sport   mean_BMI
Sport1  19.5
Sport2  19.2
Sport3  19.1
Sport4  18.6
Sport5  18.1
My data looks like this:
structure(list(Name = c("A Lamusi", "Juhamatti Tapio Aaltonen",
"Andreea Aanei", "Jamale (Djamel-) Aarrass (Ahrass-)", "Nstor Abad Sanjun",
"Nstor Abad Sanjun"), Sex = c("M", "M", "F", "M", "M", "M"),
Age = c(23L, 28L, 22L, 30L, 23L, 23L), Height = c(170L, 184L,
170L, 187L, 167L, 167L), Weight = c(60, 85, 125, 76, 64,
64), Team = c("China", "Finland", "Romania", "France", "Spain",
"Spain"), NOC = c("CHN", "FIN", "ROU", "FRA", "ESP", "ESP"
), Games = c("2012 Summer", "2014 Winter", "2016 Summer",
"2012 Summer", "2016 Summer", "2016 Summer"), Year = c(2012L,
2014L, 2016L, 2012L, 2016L, 2016L), Season = c("Summer",
"Winter", "Summer", "Summer", "Summer", "Summer"), City = c("London",
"Sochi", "Rio de Janeiro", "London", "Rio de Janeiro", "Rio de Janeiro"
), Sport = c("Judo", "Ice Hockey", "Weightlifting", "Athletics",
"Gymnastics", "Gymnastics"), Event = c("Judo Men's Extra-Lightweight",
"Ice Hockey Men's Ice Hockey", "Weightlifting Women's Super-Heavyweight",
"Athletics Men's 1,500 metres", "Gymnastics Men's Individual All-Around",
"Gymnastics Men's Floor Exercise"), Medal = c(NA, "Bronze",
NA, NA, NA, NA), BMI = c(20.7612456747405, 25.1063327032136,
43.2525951557093, 21.7335354170837, 22.9481157445588, 22.9481157445588
), weightcategories = structure(c(3L, 6L, 10L, 5L, 4L, 4L
), .Label = c("31-40", "41-50", "51-60", "61-70", "71-80",
"81-90", "91-100", "101-110", "111-120", "121-130", "131-140",
"141-150", "151-160"), class = "factor")), .Names = c("Name",
"Sex", "Age", "Height", "Weight", "Team", "NOC", "Games", "Year",
"Season", "City", "Sport", "Event", "Medal", "BMI", "weightcategories"
), row.names = c(NA, 6L), class = "data.frame")
Answer 0 (score: 1)
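After adding the BMI column, just group_by Sport only, take the mean of BMI in summarise, keep the top 5 means with top_n, and then arrange them in descending order.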
library(dplyr)
olympics %>%
group_by(Sport) %>%
summarise(mean = mean(BMI)) %>%
top_n(5,mean) %>%
arrange(desc(mean))
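The pipeline above prints the result but does not store it, and mean() returns NA for any sport where an athlete has a missing Height or Weight. A minimal sketch of the same steps that assigns the result to a new object and skips missing values follows. The toy data frame, its values, and the names toy, mean_BMI and top_bmi are made up for illustration and are not from the question; it keeps 3 rows only because the toy data has 4 sports.

```r
library(dplyr)

# Hypothetical toy data standing in for `olympics` (values are invented).
toy <- data.frame(
  Sport = c("Judo", "Judo", "Gymnastics", "Gymnastics",
            "Athletics", "Weightlifting"),
  BMI   = c(20, 22, 23, NA, 21.5, 43)
)

top_bmi <- toy %>%
  group_by(Sport) %>%                              # group by Sport only, not BMI
  summarise(mean_BMI = mean(BMI, na.rm = TRUE)) %>% # one row per sport, NAs ignored
  top_n(3, mean_BMI) %>%                           # use 5 on the real data
  arrange(desc(mean_BMI))                          # highest mean first

top_bmi
```

Grouping by BMI as well as Sport, as in the question, makes every distinct Sport/BMI pair its own group, which is likely why the original attempt returned thousands of rows.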
In base R, this would be
df1 <- aggregate(BMI~Sport, olympics, mean)
df1[order(df1$BMI, decreasing = TRUE)[1:5], ]
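One caveat with indexing by [1:5] is that it produces rows of NA when there are fewer than 5 sports. A small variation using head() avoids that. This is only a sketch on invented toy data (toy, sport_means and top_bmi_base are hypothetical names). Note that the formula interface of aggregate() drops rows with a missing BMI by default, so no na.rm argument is needed here.

```r
# Hypothetical toy data standing in for `olympics` (values are invented).
toy <- data.frame(
  Sport = c("Judo", "Judo", "Gymnastics", "Gymnastics",
            "Athletics", "Weightlifting"),
  BMI   = c(20, 22, 23, NA, 21.5, 43)
)

# Mean BMI per sport; rows with NA in BMI are omitted by the formula method.
sport_means <- aggregate(BMI ~ Sport, data = toy, FUN = mean)

# Sort by mean BMI, highest first, then keep at most 3 rows (5 on the real data).
sorted_means <- sport_means[order(sport_means$BMI, decreasing = TRUE), ]
top_bmi_base <- head(sorted_means, 3)

top_bmi_base
```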