R: new table from a list of tables

Date: 2017-07-17 12:14:25

Tags: r

I have a variable containing a list of tables, list_of_tables: t1, t2, t3, t4, t5, t6, etc.

Each table in list_of_tables (t1, t2, ...) has 8 rows, e.g.

uuid | q_id | correct 
-----------------------
  1  | 1    |   T     
  1  | 2    |   T     
  1  | 3    |   F     
  1  | 4    |   F     
  1  | 5    |   T     
  1  | 6    |   F     
  1  | 7    |   F     
  1  | 8    |   T     

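What I want to do is create a new table or data frame from list_of_tables in which each row holds the score for one uuid, based on the number of rows where correct == T.

E.g.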

uuid | c_score
--------------
  1  |  50% (4 out of 8 correct)
  2  |  ...
  3  |  ...

2 answers:

Answer 0: (score: 1)

I would use data.table, specifically:

library(data.table)
dt1 <- data.table(uuid = c(rep(1, 5), rep(2, 5)),
                  c_score = c("T", "F", "F", "F", "T", "T", "T", "T", "F", "F"))  # mock data; c_score stands in for the question's correct column

        uuid c_score
     1:    1       T
     2:    1       F
     3:    1       F
     4:    1       F
     5:    1       T
     6:    2       T
     7:    2       T
     8:    2       T
     9:    2       F
    10:    2       F

Then:

dt1[, sum(c_score == "T") / .N, by = uuid]  # per uuid: number of "T" rows divided by the group's row count (.N)

    uuid  V1
1:    1 0.4
2:    2 0.6
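The result column above gets the default name V1. As a minimal sketch of the same calculation with a named column and a percentage scale, the call below relies on mean() of a logical vector being the share of TRUE values, which equals sum(...) / .N. The column name pct_correct is an assumption chosen for illustration, not something from the question.

```r
library(data.table)

# Same mock data as in the answer
dt1 <- data.table(
  uuid = c(rep(1, 5), rep(2, 5)),
  c_score = c("T", "F", "F", "F", "T", "T", "T", "T", "F", "F")
)

# mean() of a logical vector is the fraction of TRUE values,
# so this matches sum(c_score == "T") / .N, scaled to percent.
# pct_correct is a hypothetical column name.
res <- dt1[, .(pct_correct = 100 * mean(c_score == "T")), by = uuid]
print(res)
```

Wrapping the expression in .( ) is what lets data.table name the output column.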

Edit:

If you have a list of data.tables, e.g.

l1 <- list(
  a = data.table(uuid = c(rep(1, 5), rep(2, 5)),
                 c_score = c("T", "F", "F", "F", "T", "T", "T", "T", "F", "F")),
  b = data.table(uuid = c(rep(1, 5), rep(2, 5)),
                 c_score = c("T", "T", "F", "T", "T", "F", "F", "F", "T", "T"))
)

the same operation can be applied to each element as follows (provided the column names stay the same across tables):

lapply(l1,function(x) x[,sum(c_score=="T")/.N,by=uuid])

yielding:

    $a
       uuid  V1
    1:    1 0.4
    2:    2 0.6

    $b
       uuid  V1
    1:    1 0.8
    2:    2 0.4
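The lapply call returns a list of results, whereas the question asks for a single table. One way to get there, sketched here as an assumption rather than as part of the original answer, is to stack the list with rbindlist() and group afterwards. The column names source and score are hypothetical.

```r
library(data.table)

# Same list of data.tables as in the answer
l1 <- list(
  a = data.table(uuid = c(rep(1, 5), rep(2, 5)),
                 c_score = c("T", "F", "F", "F", "T", "T", "T", "T", "F", "F")),
  b = data.table(uuid = c(rep(1, 5), rep(2, 5)),
                 c_score = c("T", "T", "F", "T", "T", "F", "F", "F", "T", "T"))
)

# Stack all tables; idcol stores each list element's name in a new column
combined <- rbindlist(l1, idcol = "source")

# One row per (source table, uuid) pair, with the share of "T" rows
scores <- combined[, .(score = mean(c_score == "T")), by = .(source, uuid)]
print(scores)
```

If every uuid appears in only one table, grouping by uuid alone would be enough; the source column is kept here because the mock data reuses uuid 1 and 2 in both tables.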

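Answer 1: (score: 1)

Here is a base R solution:

# data (random, so the values differ between runs unless a seed is set)
list_of_tables <- lapply(1:10, function(x)
  data.frame(uuid = rep(x, 10), q_id = 1:10, correct = sample(c(TRUE, FALSE), 10, replace = TRUE)))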

> list_of_tables
[[1]]
   uuid q_id correct
1     1    1    TRUE
2     1    2   FALSE
3     1    3    TRUE
4     1    4    TRUE
5     1    5   FALSE
6     1    6   FALSE
7     1    7    TRUE
8     1    8   FALSE
9     1    9    TRUE
10    1   10    TRUE

[[2]]
   uuid q_id correct
1     2    1    TRUE
2     2    2   FALSE
3     2    3    TRUE
4     2    4   FALSE
5     2    5    TRUE
6     2    6    TRUE
7     2    7   FALSE
8     2    8    TRUE
9     2    9   FALSE
10    2   10   FALSE


# one row per table: its uuid and the percentage of rows where correct is TRUE
new_t <- do.call(rbind, lapply(list_of_tables, function(x)
  data.frame(uuid = unique(x$uuid), c_score = (sum(x$correct) / nrow(x)) * 100)))

In this case, do.call(rbind, ...) puts everything back into a single data frame, but if you would rather keep the list, you can skip it.

> new_t
   uuid c_score
1     1      60
2     2      50
3     3      80
4     4      70
5     5      70
6     6      40
7     7      60
8     8      50
9     9      50
10   10      50
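The question's example output shows the score as a labelled string such as "50% (4 out of 8 correct)". A minimal base R sketch of that formatting follows, assuming correct is stored as a logical column and each table holds exactly one uuid. The helper score_one and the second table's values are made up for illustration; the first table reproduces the question's example.

```r
# First table: the question's example for uuid 1.
# Second table: hypothetical values for uuid 2.
list_of_tables <- list(
  data.frame(uuid = 1, q_id = 1:8,
             correct = c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE)),
  data.frame(uuid = 2, q_id = 1:8,
             correct = c(TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE))
)

# score_one is a hypothetical helper: one summary row per table
score_one <- function(x) {
  n_correct <- sum(x$correct)  # TRUE counts as 1
  n_total <- nrow(x)
  data.frame(
    uuid = unique(x$uuid),
    c_score = sprintf("%.0f%% (%d out of %d correct)",
                      100 * n_correct / n_total, n_correct, n_total),
    stringsAsFactors = FALSE
  )
}

scores <- do.call(rbind, lapply(list_of_tables, score_one))
print(scores)
```

If correct holds the strings "T" and "F" instead of logicals, sum(x$correct == "T") would be needed in place of sum(x$correct).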