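I am looking for a way to get frequency counts from an R data frame based on two values. I have tried a few different syntaxes, and I am fairly new to R.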
> table(frequency.data.frame$value,frequency.data.frame$value_x)[!is.na(frequency.data.frame$id),]
Error in `[.default`(table(frequency.data.frame$value, frequency.data.frame$value_x), :
(subscript) logical subscript too long
> table(frequency.data.frame$value,frequency.data.frame$value_x[!is.na(frequency.data.frame$id),])
Error in frequency.data.frame$value_x[!is.na(frequency.data.frame$id), :
incorrect number of dimensions
This gives the counts for the first dimension (value).
as.data.frame(table(frequency.data.frame[!is.na(frequency.data.frame$id),]$value))
Var1 Freq
1 2 2
2 3 2
3 4 5
4 5 21
5 6 8
6 7 19
7 8 52
8 9 33
9 10 56
10 11 1
11 12 1
And this gives the counts for the second dimension (value_x).
as.data.frame(table(frequency.data.frame[!is.na(frequency.data.frame$id),]$value_x))
Var1 Freq
1 1 50
2 2 17
3 3 12
4 4 7
5 6 18
6 8 6
7 9 1
8 10 19
9 14 1
10 15 1
11 16 11
12 17 2
13 18 2
14 96 3
15 97 4
16 98 46
An extract of sample data from the data frame:
> frequency.data.frame
id name factor value value_x
1 <NA> OSuppl=1 - Ardex | Imp_1=1 - 1 1 1
2 <NA> OSuppl=1 - Ardex | Imp_1=2 - 2 2 1
3 e7f0940c64001d4ab9d43ebd1e361292 OSuppl=1 - Ardex | Imp_1=3 - 3 3 1
4 <NA> OSuppl=1 - Ardex | Imp_1=4 - 4 4 1
5 2de771a03f49ce72eb721159933d4827 OSuppl=1 - Ardex | Imp_1=5 - 5 5 1
6 307ad612c3cc9fe5741c1fe75d1bc217 OSuppl=1 - Ardex | Imp_1=5 - 5 5 1
7 522f594612678f13f9dd5ee8f4f24df7 OSuppl=1 - Ardex | Imp_1=5 - 5 5 1
8 c1c32ac37f572fb259fe4e454bbdf743 OSuppl=1 - Ardex | Imp_1=5 - 5 5 1
9 d5b784d8f9508da7ac9573b535fe7147 OSuppl=1 - Ardex | Imp_1=5 - 5 5 1
10 e07439cdc15377d209413b31d9f80056 OSuppl=1 - Ardex | Imp_1=6 - 6 6 1
11 878a67dbbb428c65c83602fc112a24a0 OSuppl=1 - Ardex | Imp_1=6 - 6 6 1
12 5f7c27fb104685c26e53fc3267024539 OSuppl=1 - Ardex | Imp_1=7 - 7 7 1
13 6b12a3591d89f7b70587406a0c4f92bb OSuppl=1 - Ardex | Imp_1=7 - 7 7 1
14 7fb2f98867e0e100187f0b4f13baac46 OSuppl=1 - Ardex | Imp_1=7 - 7 7 1
15 99a0ffaa2066e5c4806f2e30a446a31f OSuppl=1 - Ardex | Imp_1=7 - 7 7 1
16 9d214544e8eaf3ea9c416a3dfbddb9f6 OSuppl=1 - Ardex | Imp_1=7 - 7 7 1
17 b36f990b1e0d8c5f04a47d23b70c1022 OSuppl=1 - Ardex | Imp_1=7 - 7 7 1
18 f2f9395bd9ddc16acd2253bd114aca64 OSuppl=1 - Ardex | Imp_1=7 - 7 7 1
19 4420e8499ab32631b389111935314468 OSuppl=1 - Ardex | Imp_1=8 - 8 8 1
...
An example extract of the desired result:
Var2 Var1 Freq
...
6 5 1 5
7 6 1 2
8 7 1 7
9 8 1 1
...
What syntax do I need to get the desired output shown in the example?
Answer 0 (score: 1)
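library(plyr)
# Count the rows in each (value_x, value) group; columns can be named directly inside .()
counts <- ddply(frequency.data.frame, .(value_x, value), nrow)
# ddply names the count column V1 by default, so rename all three columns
names(counts) <- c("value_x", "value", "Freq")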
value_x value Freq
1 1 1 1
2 1 2 1
3 1 3 1
4 1 4 1
5 1 5 5
6 1 6 2
7 1 7 7
8 1 8 10
9 1 9 9
10 1 10 15
11 1 11 1
12 1 12 1
13 2 1 1
...
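Note that the counts above include rows whose id is NA (rows 1, 2 and 4 of the sample each contribute a count of 1). If only rows with a non-missing id should be counted, as in the question's own attempts, one possible sketch is to filter first and then count. The data frame df and the name kept below are made up for illustration, and base R's aggregate is swapped in for ddply so the example runs without plyr.

```r
# Hypothetical toy data standing in for frequency.data.frame
df <- data.frame(
  id      = c(NA, "a", "b", "c", NA, "d"),
  value   = c(1, 5, 5, 6, 6, 5),
  value_x = c(1, 1, 1, 1, 2, 2),
  stringsAsFactors = FALSE
)

# Keep only the rows with a non-missing id
kept <- df[!is.na(df$id), ]

# Sum a column of ones per (value_x, value) pair;
# only pairs that actually occur are returned
counts <- aggregate(
  data.frame(Freq = rep(1L, nrow(kept))),
  by  = list(value_x = kept$value_x, value = kept$value),
  FUN = sum
)
print(counts)
```

With plyr loaded, the same filtering should carry over by passing the filtered data frame to ddply instead of the full one.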
Answer 1 (score: 1)
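Since we only want the counts of 'value' and 'value_x' for the rows where 'id' is not NA, subset the data frame to the non-NA elements, select the columns of interest, get the table, and convert it to a data.frame.
The syntax for the above solution is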
as.data.frame(table(subset(frequency.data.frame,
select = c('value', 'value_x'), !is.na(id))))
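To see what this returns, here is a minimal sketch on a small made-up data frame (df, res and res_nonzero are hypothetical names, not from the question). One thing it illustrates: table() reports every combination of the observed levels, including combinations that never occur together, so dropping the rows where Freq is 0 may be needed to match the desired output.

```r
# Hypothetical toy data standing in for frequency.data.frame
df <- data.frame(
  id      = c(NA, "a", "b", "c", NA, "d"),
  value   = c(1, 5, 5, 6, 6, 5),
  value_x = c(1, 1, 1, 1, 2, 2),
  stringsAsFactors = FALSE
)

# Rows with a non-NA id, restricted to the two columns being cross-tabulated
res <- as.data.frame(table(subset(df, !is.na(id), select = c(value, value_x))))

# table() keeps unobserved combinations with a count of 0; drop them if unwanted
res_nonzero <- res[res$Freq > 0, ]
print(res_nonzero)
```

Because table() is called on a data frame here, the result columns are named value and value_x rather than Var1 and Var2.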