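I'm trying to run a for loop to compute a correlation for each level of a factor variable. Each of the 32 teams in my dataset has 16 rows of data. I want to correlate Year with Points_Game for each team. I can do it one team at a time, but I'd like to get better at writing loops.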
correlate <- data %>%
  select(Team, Year, Points_Game) %>%
  filter(Team == "ARI") %>%
  select(Year, Points_Game)
cor(correlate)
I created an object holding the teams like this:
teams <- levels(data$Team)
Any help using [i] to iterate over all 32 teams and get the correlation between year and points for each team would be appreciated!
Answer 0 (score: 1)
library(dplyr)
# dummy data: 32 teams, each with one row per year from 2000 to 2009
data <- data.frame(
  Team = rep(paste0("T", 1:32), each = 10),
  Year = rep(2000:2009, times = 32),
  Points_Game = rnorm(320, 100, 10)
)
# find correlation of Year and Points_Game for each team
# r - correlation coefficient
correlate <- data %>%
  group_by(Team) %>%
  summarise(r = cor(Year, Points_Game))
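If an explicit loop is still wanted, as the question asks, the same result can be sketched in base R by indexing `teams[i]`. This is only an illustration under assumptions: the dummy data is hypothetical, `Team` is stored as a factor so that `levels()` returns the team names, and `r_by_team` and `one_team` are names made up for this example.

```r
# Hypothetical dummy data: 32 teams, 10 seasons each (not the asker's real data)
set.seed(1)
data <- data.frame(
  Team = factor(rep(paste0("T", 1:32), each = 10)),
  Year = rep(2000:2009, times = 32),
  Points_Game = rnorm(320, 100, 10)
)

# levels() only works because Team is a factor here
teams <- levels(data$Team)

# Preallocate a named numeric vector with one slot per team
r_by_team <- setNames(numeric(length(teams)), teams)

for (i in seq_along(teams)) {
  # Keep only the rows belonging to the i-th team
  one_team <- data[data$Team == teams[i], ]
  r_by_team[i] <- cor(one_team$Year, one_team$Points_Game)
}

head(r_by_team)
```

Preallocating the result and looping over `seq_along(teams)` avoids growing a vector inside the loop. The grouped `summarise()` above does the same work without the bookkeeping.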
Answer 1 (score: 0)
The data.table way:
library(data.table)
# dummy data (same as @Aleksandr's): 32 teams, one row per year from 2000 to 2009
dat <- data.table(
  Team = rep(paste0("T", 1:32), each = 10),
  Year = rep(2000:2009, times = 32),
  Points_Game = rnorm(320, 100, 10)
)
# find correlation of Year and Points_Game for each Team
result <- dat[ , .(r = cor(Year, Points_Game)), by = Team]
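For comparison, the same per-team correlation can probably be had without any packages by splitting the data frame on `Team` and applying `cor()` to each piece. This is a minimal sketch on hypothetical dummy data; `dat_base` and `r_split` are names invented for this example.

```r
# Hypothetical dummy data: 32 teams, 10 seasons each
set.seed(1)
dat_base <- data.frame(
  Team = rep(paste0("T", 1:32), each = 10),
  Year = rep(2000:2009, times = 32),
  Points_Game = rnorm(320, 100, 10)
)

# split() yields one data frame per team; sapply() collects one r per team
r_split <- sapply(
  split(dat_base, dat_base$Team),
  function(d) cor(d$Year, d$Points_Game)
)

head(r_split)
```

The result is a named numeric vector keyed by team, which is handy for quick lookups such as `r_split["T5"]`.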