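I have a dataset collected from two different universities. Each one contains information about students, such as country, grade, age, and so on. For each university, I want to extract the minimum, mean, maximum, and standard deviation of grades and age for each country (grouped by country) and build a table from the results.
The code I am using is below. I repeat it for the minimum, maximum, and standard deviation, and again for each university. Repeating it is fine, but when I build the table I have to go back to Excel to merge the statistics this code produces. Is there a direct way to do this in R?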
library(dplyr)
stats_gr <- data %>%
  select(Country, Grades, Age) %>%
  group_by(Country) %>%  # column names are case-sensitive: Country, not country
  summarise(Grades = mean(Grades), Age = mean(Age))
Answer 0 (score: 1)
I solved this with the kable() function from knitr.
library(dplyr)
df <- tibble::tribble(
~University, ~Countries, ~Grades, ~Age,
"University-1", "USA", 46, 29,
"University-1", "UK", 84, 30,
"University-1", "Sweden", 5, 28,
"University-1", "Spain", 40, 26,
"University-1", "Portugal", 49, 29,
"University-1", "Italy", 16, 24,
"University-1", "USA", 34, 19,
"University-1", "UK", 66, 28,
"University-1", "Sweden", 9, 25,
"University-1", "Spain", 80, 20,
"University-1", "Portugal", 55, 20,
"University-1", "Italy", 4, 21,
"University-1", "USA", 93, 18,
"University-1", "UK", 62, 28,
"University-1", "Sweden", 80, 30,
"University-2", "Spain", 1, 22,
"University-2", "Portugal", 56, 25,
"University-2", "Italy", 9, 29,
"University-2", "USA", 40, 21,
"University-2", "UK", 54, 20,
"University-2", "Sweden", 60, 24,
"University-2", "Spain", 77, 21,
"University-2", "Portugal", 22, 18,
"University-2", "Italy", 53, 29,
"University-2", "USA", 11, 21,
"University-2", "UK", 65, 27,
"University-2", "Sweden", 24, 27,
"University-2", "Spain", 18, 23,
"University-2", "Portugal", 73, 19,
"University-2", "Italy", 79, 22,
"University-1", "USA", 2, 26,
"University-1", "UK", 83, 23,
"University-1", "Sweden", 5, 19,
"University-1", "Spain", 75, 19,
"University-1", "Portugal", 12, 21,
"University-1", "Italy", 68, 29,
"University-1", "USA", 100, 21,
"University-1", "UK", 49, 21,
"University-1", "Sweden", 81, 20,
"University-1", "Spain", 99, 23,
"University-1", "Portugal", 82, 24,
"University-1", "Italy", 23, 26,
"University-1", "USA", 86, 30,
"University-1", "UK", 50, 20,
"University-1", "Sweden", 4, 19,
"University-2", "Spain", 12, 25,
"University-2", "Portugal", 12, 21,
"University-2", "Italy", 45, 21,
"University-2", "USA", 16, 26,
"University-2", "UK", 56, 23,
"University-2", "Sweden", 63, 24,
"University-2", "Spain", 37, 28,
"University-2", "Portugal", 86, 21,
"University-2", "Italy", 95, 18,
"University-2", "USA", 56, 20,
"University-2", "UK", 27, 20,
"University-2", "Sweden", 3, 27,
"University-2", "Spain", 18, 27,
"University-2", "Portugal", 68, 27,
"University-2", "Italy", 48, 21
)
df %>%
  group_by(University, Countries) %>%
  summarise(Grades_min = min(Grades),
            Grades_mean = mean(Grades),
            Grades_max = max(Grades),
            Grades_sd = sd(Grades),
            Age_min = min(Age),
            Age_mean = mean(Age),
            Age_max = max(Age),
            Age_sd = sd(Age)) %>%
  knitr::kable(col.names = c("University",
                             "Country",
                             "Min Grade",
                             "Mean Grade",
                             "Max Grade",
                             "Grade SD",
                             "Min Age",
                             "Mean Age",
                             "Max Age",
                             "Age SD"))
|University |Country | Min Grade| Mean Grade| Max Grade| Grade SD| Min Age| Mean Age| Max Age| Age SD|
|:------------|:--------|---------:|----------:|---------:|--------:|-------:|--------:|-------:|--------:|
|University-1 |Italy | 4| 27.75000| 68| 27.95681| 21| 25.00000| 29| 3.366502|
|University-1 |Portugal | 12| 49.50000| 82| 28.82707| 20| 23.50000| 29| 4.041452|
|University-1 |Spain | 40| 73.50000| 99| 24.61030| 19| 22.00000| 26| 3.162278|
|University-1 |Sweden | 4| 30.66667| 81| 38.64022| 19| 23.50000| 30| 4.847680|
|University-1 |UK | 49| 65.66667| 84| 15.31883| 20| 25.00000| 30| 4.195235|
|University-1 |USA | 2| 60.16667| 100| 38.98931| 18| 23.83333| 30| 5.192944|
|University-2 |Italy | 9| 54.83333| 95| 29.81554| 18| 23.33333| 29| 4.589844|
|University-2 |Portugal | 12| 52.83333| 86| 29.54601| 18| 21.83333| 27| 3.488075|
|University-2 |Spain | 1| 27.16667| 77| 27.06597| 21| 24.33333| 28| 2.804758|
|University-2 |Sweden | 3| 37.50000| 63| 29.03446| 24| 25.50000| 27| 1.732051|
|University-2 |UK | 27| 50.50000| 65| 16.38088| 20| 22.50000| 27| 3.316625|
|University-2 |USA | 11| 30.75000| 56| 21.06142| 20| 22.00000| 26| 2.708013|
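The advantage of this approach is that it works well if you want to knit to Word with rmarkdown: the table is rendered as a regular Word table.
You can use the relevant kable arguments to control the number of digits, the table caption, or the column alignment.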
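As a rough sketch of those arguments, the call below rounds to one decimal place, adds a caption, and sets the column alignment through `digits`, `caption`, and `align`. It runs on a tiny made-up sample that only borrows the column names used in this answer. The caption text, the `toy` names, and the use of `across()` (dplyr 1.0 or later) to shorten the summary are assumptions for illustration, not part of the original answer.

```r
library(dplyr)

# Tiny made-up sample with the same columns as df (illustrative values only)
toy <- tibble::tribble(
  ~University,    ~Countries, ~Grades, ~Age,
  "University-1", "USA",      46,      29,
  "University-1", "USA",      34,      19,
  "University-2", "UK",       54,      20,
  "University-2", "UK",       65,      27
)

# across() applies every function in the list to each selected column and
# names the results <column>_<function>, for example Grades_min
toy_stats <- toy %>%
  group_by(University, Countries) %>%
  summarise(
    across(c(Grades, Age), list(min = min, mean = mean, max = max, sd = sd)),
    .groups = "drop"
  )

toy_table <- knitr::kable(
  toy_stats,
  digits  = 1,                                            # round numeric columns
  caption = "Grades and age by university and country",   # hypothetical caption
  align   = c("l", "l", rep("r", 8))                      # one entry per column
)
toy_table
```

Because `across()` generates the column names, adding another statistic only means adding one entry to the list of functions.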
Answer 1 (score: 0)
Maybe stargazer would suit you:
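library(dplyr)
library(stargazer)
# stargazer() expects a plain data.frame (it may not handle tibbles) and
# reports summary statistics over all rows, not per Country
stats_gr <- data %>%
  select(Country, Grades, Age) %>%
  as.data.frame() %>%
  stargazer(type = "text")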
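By default stargazer summarises the numeric columns of a data frame as a whole. If one row per country is wanted, one possible approach is to compute the grouped statistics with dplyr first and then ask stargazer to print that data frame as-is with `summary = FALSE`. The sketch below uses a small made-up data frame; the `toy` and `by_country` names and the values are hypothetical.

```r
library(dplyr)
library(stargazer)

# Made-up sample using the question's column names (illustrative values only)
toy <- data.frame(
  Country = c("USA", "USA", "UK", "UK"),
  Grades  = c(46, 34, 84, 66),
  Age     = c(29, 19, 30, 28)
)

# Compute the per-country statistics first, then convert to a plain data.frame
by_country <- toy %>%
  group_by(Country) %>%
  summarise(
    Grades_mean = mean(Grades),
    Grades_sd   = sd(Grades),
    Age_mean    = mean(Age),
    Age_sd      = sd(Age),
    .groups = "drop"
  ) %>%
  as.data.frame()

# summary = FALSE prints the data frame itself instead of summarising it again
stargazer(by_country, type = "text", summary = FALSE, rownames = FALSE, digits = 2)
```

Changing `type` to `"latex"` or `"html"` produces the same table in those formats.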