I have a variable, list_of_tables, that holds a list of tables: t1, t2, t3, t4, t5, t6, and so on.
Each table in list_of_tables (t1, t2, ...) has 8 rows, e.g.:
uuid | q_id | correct
-----------------------
1 | 1 | T
1 | 2 | T
1 | 3 | F
1 | 4 | F
1 | 5 | T
1 | 6 | F
1 | 7 | F
1 | 8 | T
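What I want to do is build a new table or data frame from list_of_tables in which each row holds the percent-correct score, based on the number of rows where correct == T.
E.g.: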
uuid | c_score
--------------
1 | 50% (4 out of 8 correct)
2 | ...
3 | ...
Answer 0 (score: 1)
I would use data.table, specifically:
library(data.table)
dt1 <- data.table(uuid = c(rep(1, 5), rep(2, 5)), c_score = c("T", "F", "F", "F", "T", "T", "T", "T", "F", "F")) # mock-up data
uuid c_score
1: 1 T
2: 1 F
3: 1 F
4: 1 F
5: 1 T
6: 2 T
7: 2 T
8: 2 T
9: 2 F
10: 2 F
Then:
dt1[, sum(c_score == "T") / .N, by = uuid] # count the "T" rows in c_score and divide by the number of rows in each uuid group
uuid V1
1: 1 0.4
2: 2 0.6
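As a small variation on the call above, the result column can be given a name and expressed as a percentage, which is closer to the output the question asks for. This is only a minimal sketch reusing the mock-up dt1; the column name pct_correct is made up for illustration, and mean() of a logical vector is used as an equivalent of sum(...) / .N.

```r
library(data.table)

# Same mock-up data as above: c_score holds the "T"/"F" flags
dt1 <- data.table(
  uuid    = c(rep(1, 5), rep(2, 5)),
  c_score = c("T", "F", "F", "F", "T", "T", "T", "T", "F", "F")
)

# mean() of a logical vector is the share of TRUE values,
# so this equals sum(c_score == "T") / .N, scaled to percent.
# pct_correct is a hypothetical column name chosen for this sketch.
res <- dt1[, .(pct_correct = 100 * mean(c_score == "T")), by = uuid]
print(res)
```

Naming the column inside .() avoids the default V1 name, which makes the result easier to join or print later.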
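If you have a list of data.tables instead, for example:
l1 <- list(
  a = data.table(uuid = c(rep(1, 5), rep(2, 5)), c_score = c("T", "F", "F", "F", "T", "T", "T", "T", "F", "F")),
  b = data.table(uuid = c(rep(1, 5), rep(2, 5)), c_score = c("T", "T", "F", "T", "T", "F", "F", "F", "T", "T"))
)
the same calculation can be applied to each element (provided the column names stay the same):
lapply(l1, function(x) x[, sum(c_score == "T") / .N, by = uuid])
yielding: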
$a
uuid V1
1: 1 0.4
2: 2 0.6
$b
uuid V1
1: 1 0.8
2: 2 0.4
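Since the question asks for one table rather than a list, the per-table results from lapply can be stacked afterwards. The following is a sketch under the same mock-up data; it assumes data.table's rbindlist() with its idcol argument, and the column names table and score are chosen here for illustration only.

```r
library(data.table)

# Same mock-up list as above
l1 <- list(
  a = data.table(uuid = c(rep(1, 5), rep(2, 5)),
                 c_score = c("T", "F", "F", "F", "T", "T", "T", "T", "F", "F")),
  b = data.table(uuid = c(rep(1, 5), rep(2, 5)),
                 c_score = c("T", "T", "F", "T", "T", "F", "F", "F", "T", "T"))
)

# Score each table, then stack the results into one data.table.
# idcol = "table" adds a column recording which list element each row came from.
scores <- rbindlist(
  lapply(l1, function(x) x[, .(score = sum(c_score == "T") / .N), by = uuid]),
  idcol = "table"
)
print(scores)
```

Keeping the list name in its own column means rows from different tables that share a uuid stay distinguishable after stacking.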
Answer 1 (score: 1)
Here is a base R solution:
# example data: 10 tables, one per uuid, with 10 questions each
list_of_tables <- lapply(1:10, function(x)
  data.frame(uuid = rep(x, 10), q_id = 1:10, correct = sample(c(TRUE, FALSE), 10, replace = TRUE)))
> list_of_tables
[[1]]
uuid q_id correct
1 1 1 TRUE
2 1 2 FALSE
3 1 3 TRUE
4 1 4 TRUE
5 1 5 FALSE
6 1 6 FALSE
7 1 7 TRUE
8 1 8 FALSE
9 1 9 TRUE
10 1 10 TRUE
[[2]]
uuid q_id correct
1 2 1 TRUE
2 2 2 FALSE
3 2 3 TRUE
4 2 4 FALSE
5 2 5 TRUE
6 2 6 TRUE
7 2 7 FALSE
8 2 8 TRUE
9 2 9 FALSE
10 2 10 FALSE
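new_t <- do.call(rbind,
  lapply(list_of_tables, function(x)
    data.frame(uuid = unique(x$uuid), c_score = (sum(x$correct) / nrow(x)) * 100)))
Here, do.call(rbind, ...) binds the per-table results back into a single data frame; if you would rather keep the results as a list, you can skip it.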
> new_t
uuid c_score
1 1 60
2 2 50
3 3 80
4 4 70
5 5 70
6 6 40
7 7 60
8 8 50
9 9 50
10 10 50
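The example above uses a logical correct column, whereas the table in the question shows T and F, which may be stored as text. If that is the case, a comparison such as correct == "T" is needed before summing. The sketch below assumes character "T"/"F" values and swaps in aggregate() instead of the lapply approach above; t1 copies the question's example rows, while t2 and the helper column is_correct are made up for illustration.

```r
# t1 mirrors the question's example; t2 is hypothetical data for illustration
t1 <- data.frame(uuid = 1, q_id = 1:8,
                 correct = c("T", "T", "F", "F", "T", "F", "F", "T"))
t2 <- data.frame(uuid = 2, q_id = 1:8,
                 correct = c("T", "T", "T", "T", "T", "T", "F", "F"))
list_of_tables <- list(t1, t2)

# Stack all tables first, then convert the text flag to a logical
all_rows <- do.call(rbind, list_of_tables)
all_rows$is_correct <- all_rows$correct == "T"

# Percent correct per uuid: mean of a logical is the share of TRUE values
new_t <- aggregate(is_correct ~ uuid, data = all_rows,
                   FUN = function(v) 100 * mean(v))
names(new_t)[2] <- "c_score"
print(new_t)
```

Grouping by uuid after stacking also copes with a table that contains more than one uuid, which unique(x$uuid) in the per-table version would not handle.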