I have the following dataset:
dat <- structure(list(Probes = structure(1:6, .Label = c("1415670_at",
"1415671_at", "1415672_at", "1415673_at", "1415674_a_at", "1415675_at"
), class = "factor"), Genes = structure(c(2L, 1L, 4L, 5L, 6L,
3L), .Label = c("Atp6v0d1", "Copg1", "Dpm2", "Golga7", "Psph",
"Trappc4"), class = "factor"), bCD.ID.LN = c(1.133, 1.068, 1.01,
0.943, 1.048, 1.053), bCD.ID.LV = c(1.049, 1.006, 0.883, 0.799,
0.96, 1.104), bCD.ID.SP = c(1.124, 1.234, 1.029, 1.064, 1.118,
1.057), bCD.IP.LV = c(1.013, 1.082, 1.061, 0.982, 1.191, 1.053
), bCD.IP.SP = c(0.986, 1.102, 1.085, 0.997, 1.141, 1.041)), .Names = c("Probes",
"Genes", "bCD.ID.LN", "bCD.ID.LV", "bCD.ID.SP", "bCD.IP.LV",
"bCD.IP.SP"), row.names = c(NA, 6L), class = "data.frame")
It looks like this:
> dat
Probes Genes bCD.ID.LN bCD.ID.LV bCD.ID.SP bCD.IP.LV bCD.IP.SP
1 1415670_at Copg1 1.133 1.049 1.124 1.013 0.986
2 1415671_at Atp6v0d1 1.068 1.006 1.234 1.082 1.102
3 1415672_at Golga7 1.010 0.883 1.029 1.061 1.085
4 1415673_at Psph 0.943 0.799 1.064 0.982 0.997
5 1415674_a_at Trappc4 1.048 0.960 1.118 1.191 1.141
6 1415675_at Dpm2 1.053 1.104 1.057 1.053 1.041
For each column from the third one onward, I want to count the rows whose value is > 1.1, so that the result ends up looking like this:
bCD.ID.LN 1
bCD.ID.LV 1
bCD.ID.SP 3
bCD.IP.LV 1
bCD.IP.SP 2
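How can I do this?
Answer 0 (score: 4)
We can use colSums on the logical matrix obtained by comparing the numeric columns of the dataset against the threshold.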
Count <- colSums(dat[-(1:2)] > 1.1, na.rm=TRUE)
If we need a data.frame:
d1 <- data.frame(Cnames = names(Count), Count=unname(Count))
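To make the colSums step less opaque, here is a minimal sketch that splits it in two. The data frame is rebuilt with data.frame() from the values shown in the question, and the intermediate name `above` is only an illustrative choice, not part of the original answer.

```r
# Same values as the question's `dat`, rebuilt with data.frame() for brevity.
dat <- data.frame(
  Probes = c("1415670_at", "1415671_at", "1415672_at",
             "1415673_at", "1415674_a_at", "1415675_at"),
  Genes = c("Copg1", "Atp6v0d1", "Golga7", "Psph", "Trappc4", "Dpm2"),
  bCD.ID.LN = c(1.133, 1.068, 1.010, 0.943, 1.048, 1.053),
  bCD.ID.LV = c(1.049, 1.006, 0.883, 0.799, 0.960, 1.104),
  bCD.ID.SP = c(1.124, 1.234, 1.029, 1.064, 1.118, 1.057),
  bCD.IP.LV = c(1.013, 1.082, 1.061, 0.982, 1.191, 1.053),
  bCD.IP.SP = c(0.986, 1.102, 1.085, 0.997, 1.141, 1.041)
)

# Comparing the numeric columns with a number gives a logical matrix
# of the same shape (`above` is an illustrative name).
above <- dat[-(1:2)] > 1.1

# colSums() counts TRUE as 1 and FALSE as 0, so each column sum is a count.
Count <- colSums(above, na.rm = TRUE)
print(Count)
```

With the question's data the counts come out as 1, 1, 3, 1, 2, which matches the desired output.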
If the dataset is large, converting it to a logical matrix may not be efficient; in that case it is better to loop over the columns with vapply:
vapply(dat[-(1:2)], function(x) sum(x > 1.1, na.rm=TRUE), 0)
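As a possible variation on the vapply call, the sketch below wraps it in a small helper. Note that `count_above` is a hypothetical name, and selecting columns with is.numeric (instead of dropping the first two by position) is my assumption, not something the answer prescribes. It also uses integer(1) as the template, since sum() of a logical vector returns an integer; the 0 in the original works too because vapply promotes integer results to double. The data is a trimmed copy of the question's values (first three measurement columns).

```r
# Trimmed copy of the question's data: first three measurement columns only.
dat <- data.frame(
  Probes = c("1415670_at", "1415671_at", "1415672_at",
             "1415673_at", "1415674_a_at", "1415675_at"),
  Genes = c("Copg1", "Atp6v0d1", "Golga7", "Psph", "Trappc4", "Dpm2"),
  bCD.ID.LN = c(1.133, 1.068, 1.010, 0.943, 1.048, 1.053),
  bCD.ID.LV = c(1.049, 1.006, 0.883, 0.799, 0.960, 1.104),
  bCD.ID.SP = c(1.124, 1.234, 1.029, 1.064, 1.118, 1.057)
)

# Hypothetical helper: count values above `threshold` in every numeric column.
count_above <- function(df, threshold) {
  # Keep only the numeric columns (assumption: identifiers are non-numeric).
  num <- df[vapply(df, is.numeric, logical(1))]
  # One integer per column; vapply errors if a result has another type or length.
  vapply(num, function(x) sum(x > threshold, na.rm = TRUE), integer(1))
}

counts <- count_above(dat, 1.1)
print(counts)
```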
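Answer 1 (score: 2)
Another version, this time using dplyr:
library(dplyr)
# note: summarise_each() and funs() are deprecated in recent dplyr releases
dat %>%
  select(-c(Probes, Genes)) %>%
  summarise_each(funs(sum(. > 1.1)))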
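The pipeline above relies on summarise_each() and funs(), which are deprecated in current dplyr releases. A rough equivalent, assuming dplyr 1.0.0 or later is installed, could use across() instead. This is a sketch of the same idea rather than the answerer's own code, and the na.rm = TRUE argument is an addition. The data is a trimmed copy of the question's values.

```r
library(dplyr)

# Trimmed copy of the question's data: first three measurement columns only.
dat <- data.frame(
  Probes = c("1415670_at", "1415671_at", "1415672_at",
             "1415673_at", "1415674_a_at", "1415675_at"),
  Genes = c("Copg1", "Atp6v0d1", "Golga7", "Psph", "Trappc4", "Dpm2"),
  bCD.ID.LN = c(1.133, 1.068, 1.010, 0.943, 1.048, 1.053),
  bCD.ID.LV = c(1.049, 1.006, 0.883, 0.799, 0.960, 1.104),
  bCD.ID.SP = c(1.124, 1.234, 1.029, 1.064, 1.118, 1.057)
)

res <- dat %>%
  select(-c(Probes, Genes)) %>%
  # Apply the same count to every remaining column.
  summarise(across(everything(), ~ sum(.x > 1.1, na.rm = TRUE)))

print(res)
```

The result is a one-row data frame with one count per measurement column.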
Answer 2 (score: 1)
Here is an alternative version using lapply():
lapply(dat[-c(1:2)], function(x) length(which(x > 1.1)))
Or, if you want the result as a data.frame():
data.frame( lapply(dat[-c(1:2)], function(x) length(which(x > 1.1))))