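I have a large data.frame consisting of 239 objects and 546639 variables. The elements of the data.frame are A, B, or 0. I want to count how many times each element occurs in each row. Here is the data.frame: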
1 rs22233… B B B B B B B B B B B
2 rs38622… B B B B B B B B A B A
3 rs13933… B B A B B B B B B B B
4 rs38637… B B A A A B B B A B A
5 rs12554… B B B B A B A B B B B
6 rs41105… A A A A B A B A A A B
Answer 0 (score: 2)
We can use apply with table to count by row:
apply(df[-c(1,2)],1,table)
# [[1]]
#
# B
# 11
#
# [[2]]
#
# A B
# 2 9
#
# [[3]]
#
# A B
# 1 10
#
# [[4]]
#
# A B
# 5 6
#
# [[5]]
#
# A B
# 2 9
#
# [[6]]
#
# A B
# 8 3
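For readers who want to try this, here is a minimal reproducible sketch. It rebuilds the six sample rows from the question, assuming the index and the SNP id are the columns V1 and V2 (which is what df[-c(1,2)] implies). The ids snp1 to snp6 are hypothetical stand-ins for the truncated originals.

```r
# Genotype calls copied from the six sample rows in the question.
rows <- c(
  "B B B B B B B B B B B",
  "B B B B B B B B A B A",
  "B B A B B B B B B B B",
  "B B A A A B B B A B A",
  "B B B B A B A B B B B",
  "A A A A B A B A A A B"
)
calls <- do.call(rbind, strsplit(rows, " "))   # 6 x 11 character matrix

# Assumed layout: V1 = index, V2 = SNP id (hypothetical ids), V3..V13 = calls.
df <- data.frame(V1 = seq_along(rows),
                 V2 = paste0("snp", seq_along(rows)),
                 calls,
                 stringsAsFactors = FALSE)
names(df) <- paste0("V", seq_len(ncol(df)))

# Drop the two non-genotype columns and tabulate each row.
counts <- apply(df[-c(1, 2)], 1, table)
```

Because row 1 contains only B, the per-row tables differ in length and apply returns a list. If every row produced the same set of values, apply would simplify the result to a matrix instead.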
Answer 1 (score: 2)
Using table (thanks to @thelatemail):
table(factor(unlist(df[-1]), levels = c("A", "B", "0")), row(df[-1]))
# 1 2 3 4 5 6
# A 0 2 1 5 2 8
# B 11 9 10 6 9 3
# 0 0 0 0 0 0 0
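Why this works can be seen on a toy example: unlist() flattens the data.frame column by column, and row() returns the row index of every cell in the same column-major order, so the two line up cell for cell. A minimal sketch, using a hypothetical two-row data.frame (ids and values are made up):

```r
# Hypothetical toy data: one id column followed by three genotype columns.
df <- data.frame(V2 = c("snp1", "snp2"),
                 V3 = c("B", "A"),
                 V4 = c("B", "B"),
                 V5 = c("A", "A"),
                 stringsAsFactors = FALSE)

vals <- unlist(df[-1])   # column-major: V3 (rows 1, 2), V4 (rows 1, 2), V5 (rows 1, 2)
idx  <- row(df[-1])      # row index of each cell, in the same order

# Cross-tabulate value against row index; fixed levels keep the "0" row.
tab <- table(factor(vals, levels = c("A", "B", "0")), idx)

# One row per SNP instead of one column per SNP.
per_row <- t(tab)
```

Transposing with t() is optional; it only changes the orientation so that each SNP is a row.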
Or (slower):
sapply(split(df, 1:nrow(df)), function(x)
  table(factor(unlist(x[, -1]), levels = c("A", "B", "0"))))
# 1 2 3 4 5 6
#A 0 2 1 5 2 8
#B 11 9 10 6 9 3
#0 0 0 0 0 0 0
Explanation: factor(..., levels = c("A", "B", "0")) ensures that table always reports counts for the same three factor levels, so the result can be stored in a matrix.
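A small sketch of the difference the fixed levels make. The vectors and the matrix m are made-up illustrations, not data from the question:

```r
lv <- c("A", "B", "0")

x <- c("B", "B", "B")
plain <- table(x)                        # reports only the value present (B)
fixed <- table(factor(x, levels = lv))   # reports A, B and 0, with zero counts

# With fixed levels every row yields a vector of length 3,
# so apply() can simplify the result to a 3 x nrow matrix.
m <- rbind(c("B", "B", "B"),
           c("A", "B", "A"))
counts <- apply(m, 1, function(r) table(factor(r, levels = lv)))
```

Without the fixed levels, the first row would give one count and the second row two, and the result would fall back to a list.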
Using rle:
lapply(split(df, 1:nrow(df)), function(x)
  as.data.frame(unclass(rle(as.character(sort(unlist(x[, -1])))))))
#$`1`
# lengths values
#1 11 B
#
#$`2`
# lengths values
#1 2 A
#2 9 B
#
#$`3`
# lengths values
#1 1 A
#2 10 B
#
#$`4`
# lengths values
#1 5 A
#2 6 B
#
#$`5`
# lengths values
#1 2 A
#2 9 B
#
#$`6`
# lengths values
#1 8 A
#2 3 B
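The sort step matters here: rle counts runs of consecutive equal values, so only a sorted vector gives one run per distinct value. A minimal sketch on a made-up vector:

```r
x <- c("B", "A", "B", "B", "A")

unsorted <- rle(x)        # runs in original order: B, A, BB, A
runs     <- rle(sort(x))  # sorted to A A B B B, so one run per value

# Same conversion as in the answer: drop the "rle" class, then make a data.frame.
counts <- as.data.frame(unclass(runs))
```

Like plain table, this reports only values that actually occur, which is why the first row in the output above has a single line for B.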
Using tidyr::gather and dplyr::count:
library(tidyverse)
df %>%
gather(key, val, -V2) %>%
count(V2, val)
## A tibble: 11 x 3
#V2 val n
#<fct> <chr> <int>
#1 rs12554… A 2
#2 rs12554… B 9
#3 rs13933… A 1
#4 rs13933… B 10
#5 rs22233… B 11
#6 rs38622… A 2
#7 rs38622… B 9
#8 rs38637… A 5
#9 rs38637… B 6
#10 rs41105… A 8
#11 rs41105… B 3
Answer 2 (score: 1)
Using dplyr and tidyr:
library(dplyr)
library(tidyr)
df %>%
gather(key, value, V3:V13) %>%
group_by(V2) %>%
count(value) %>%
spread(value, n)
# A tibble: 6 x 3
# Groups: V2 [6]
V2 A B
<fct> <int> <int>
1 rs12554… 2 9
2 rs13933… 1 10
3 rs22233… NA 11
4 rs38622… 2 9
5 rs38637… 5 6
6 rs41105… 8 3
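In the output above, rs22233… shows NA under A because that row contains no A at all. gather() and spread() are also superseded in current tidyr. A hedged sketch of the same pipeline with pivot_longer() and pivot_wider(), where values_fill turns missing combinations into 0. It assumes a reasonably recent tidyr (scalar values_fill, which I believe needs 1.1.0 or later) and uses a hypothetical three-row data.frame:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: id column V2 plus three genotype columns.
df <- data.frame(V2 = c("snp1", "snp2", "snp3"),
                 V3 = c("B", "A", "B"),
                 V4 = c("B", "B", "A"),
                 V5 = c("B", "A", "A"),
                 stringsAsFactors = FALSE)

wide <- df %>%
  pivot_longer(-V2, names_to = "key", values_to = "value") %>%  # long format
  count(V2, value) %>%                                          # count per id and value
  pivot_wider(names_from = value, values_from = n,
              values_fill = 0L)                                 # absent values become 0
```

Here snp1 has no A, so its A count comes out as 0 rather than NA.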