I have a problem I can't solve!
I have a data frame named GOLFRQuiz:
qname uname first second third fourth fifth Best_try Forums_Read_Count
quiz_1 Alex 85 28 28 0 0 85 390
quiz_2 Alex 33 33 33 33 0 33 390
quiz_3 Alex 25 0 0 0 0 25 390
quiz_5 Alex 25 0 0 0 0 25 390
quiz_1 marko 42 71 71 71 0 71 50
quiz_2 marko 83 100 100 100 100 100 50
quiz_8 marko 75 0 0 0 0 75 50
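I am trying to get the mean of Best_try for each uname, together with his/her Forums_Read_Count, and save the result in a new data frame. The main problem is that some users took 3 quizzes and others took 7! So what I want is something like this: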
uname best_try Forums_read_count
Alex 42 390
marko 82 50
To reproduce the data frame:
structure(list(qname = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 5L
), .Label = c("quiz_1", "quiz_2", "quiz_3", "quiz_5", "quiz_8"
), class = "factor"), uname = structure(c(1L, 1L, 1L, 1L, 2L,
2L, 2L), .Label = c("Alex", "marko"), class = "factor"), first = structure(c(6L,
2L, 1L, 1L, 3L, 5L, 4L), .Label = c("25", "33", "42", "75", "83",
"85"), class = "factor"), second = structure(c(3L, 4L, 1L, 1L,
5L, 2L, 1L), .Label = c("0", "100", "28", "33", "71"), class = "factor"),
third = structure(c(3L, 4L, 1L, 1L, 5L, 2L, 1L), .Label = c("0",
"100", "28", "33", "71"), class = "factor"), fourth = structure(c(1L,
3L, 1L, 1L, 4L, 2L, 1L), .Label = c("0", "100", "33", "71"
), class = "factor"), fifth = structure(c(1L, 1L, 1L, 1L,
1L, 2L, 1L), .Label = c("0", "100"), class = "factor"), Best_try = structure(c(6L,
3L, 2L, 2L, 4L, 1L, 5L), .Label = c("100", "25", "33", "71",
"75", "85"), class = "factor"), Forums_Read_Count = structure(c(1L,
1L, 1L, 1L, 2L, 2L, 2L), .Label = c("390", "50"), class = "factor")), .Names = c("qname",
"uname", "first", "second", "third", "fourth", "fifth", "Best_try",
"Forums_Read_Count"), row.names = c(NA, -7L), class = "data.frame")
Answer 0: (score: 3)
I think you are looking for aggregate(), for example:
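# df is the data frame from the question (GOLFRQuiz). Best_try is stored as a
# factor there, so convert it to numeric before averaging.
df$Best_try <- as.numeric(as.character(df$Best_try))
df2 <- aggregate(df$Best_try, list(df$uname), mean)
colnames(df2) <- c("uname", "avg_best_try")
df2$Forums_Read_Count <- df$Forums_Read_Count[match(df2$uname, df$uname)]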
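As a possible variation on the aggregate() approach, the formula interface can group by uname and Forums_Read_Count at once, which avoids the separate match() step. This is only a sketch: the data frame `quiz` and the variable `result` are made-up names, the data is a numeric toy copy of the relevant columns from the question, and it relies on Forums_Read_Count being constant within each user.

```r
# Toy copy of the question's data with numeric columns (only the columns used here).
quiz <- data.frame(
  uname = c("Alex", "Alex", "Alex", "Alex", "marko", "marko", "marko"),
  Best_try = c(85, 33, 25, 25, 71, 100, 75),
  Forums_Read_Count = c(390, 390, 390, 390, 50, 50, 50)
)

# Group by both columns. Forums_Read_Count does not vary within a user,
# so this still yields one row per uname, however many quizzes each user took.
result <- aggregate(Best_try ~ uname + Forums_Read_Count, data = quiz, FUN = mean)
print(result)
```

If Forums_Read_Count could differ between rows of the same user, this would produce several rows per user, and the match() approach above would be the safer choice.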
Answer 1: (score: 2)
You could try
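library(dplyr)
# df1 is the question's data frame; Best_try is a factor there, so convert it first
df1 %>%
    mutate(Best_try = as.numeric(as.character(Best_try))) %>%
    group_by(uname) %>%
    summarise(best_try = mean(Best_try),
              Forums_read_count = unique(Forums_Read_Count))
#  uname best_try Forums_read_count
#1  Alex       42               390
#2 marko       82                50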
Or
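library(data.table)
# Best_try is a factor in the question's data, so convert it inside mean()
setDT(df1)[, list(best_try = mean(as.numeric(as.character(Best_try))),
                  Forums_read_count = Forums_Read_Count[1L]), by = uname]
#   uname best_try Forums_read_count
#1:  Alex       42               390
#2: marko       82                50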
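One thing to watch for with the data as posted: the dput() output in the question stores every column as a factor, and mean() on a factor returns NA with a warning. The sketch below shows why the conversion has to go through as.character(), and one way to convert several columns at once. The names `scores`, `quiz` and `num_cols` are invented for illustration.

```r
# A factor whose labels look like numbers, as in the question's dput() output.
scores <- factor(c("85", "33", "25", "25"))

# as.numeric() on a factor returns the internal level codes, not the labels.
# Levels are sorted as strings ("25", "33", "85"), so the codes are 3 2 1 1.
codes <- as.numeric(scores)

# Going through as.character() first recovers the actual values.
values <- as.numeric(as.character(scores))
mean(values)

# The same conversion applied to several columns of a data frame.
quiz <- data.frame(
  first = factor(c("85", "33")),
  Best_try = factor(c("85", "33"))
)
num_cols <- c("first", "Best_try")
quiz[num_cols] <- lapply(quiz[num_cols], function(x) as.numeric(as.character(x)))
str(quiz)
```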