I have a data file organized by rows, each with an x label, a y label, and then a value. The basic structure looks like this:
Factor_1,Factor_2,Number
apple,apple,1
banana,apple,1
apple,kiwi,6
apple,pear,1
watermelon,apple,8
banana,banana,3
banana,kiwi,2
banana,pear,1
banana,watermelon,9
kiwi,kiwi,9
pear,kiwi,4
kiwi,watermelon,4
pear,pear,3
pear,watermelon,9
watermelon,watermelon,1
...
...
With this data, I am building something similar to a correlation matrix using this code:
library(ggplot2)
library(reshape2)
library(plyr)  # provides ddply()
# sep must match the file's actual delimiter
d <- read.csv("my_file", header = TRUE, sep = "\t")
x <- dcast(d, Factor_1 ~ Factor_2, value.var = "Number")
x.m <- melt(x)
x.m <- ddply(x.m, .(variable))
(p <- ggplot(x.m, aes(variable, Factor_1)) + geom_tile(aes(fill = value), colour = "black") + scale_fill_gradient(low = "black", high = "green"))
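The code above produces a plot like this:
How can I reorder the data to build the plot in the following format, where all the data is grouped below the diagonal?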
   A  B  K  P  W
W              1
P           3  9
K        9  4  4
B     3  2  1  9
A  1  1  6  1  8
Answer 0 (score: 3)
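One solution is to put the elements in the first two columns of the data.frame into the right order up front. (Note: for this to work, both columns should be of class character. If they are factors, coerce them with as.character() first.)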
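As a minimal sketch of the coercion mentioned in the note, assuming the label columns came in as factors (the two-row data frame and the name label_cols are hypothetical stand-ins, not from the question):

```r
# Hypothetical data frame whose label columns were read in as factors
d <- data.frame(
  Factor_1 = factor(c("banana", "watermelon")),
  Factor_2 = factor(c("apple", "apple")),
  Number   = c(1, 8)
)

label_cols <- c("Factor_1", "Factor_2")

# Convert each label column from factor to character in place
d[label_cols] <- lapply(d[label_cols], as.character)

str(d)
```

In R 4.0.0 and later, read.csv() returns character columns by default, so this step may already be unnecessary there.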
After doing the following, the plot should come out the way you want:
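# Sort each row's pair of labels alphabetically; the result is a 2 x n matrix
ordered <- apply(d[c("Factor_1", "Factor_2")], 1, sort)
# First row holds the alphabetically smaller label, second row the larger one
d[c("Factor_1")] <- ordered[1,]
d[c("Factor_2")] <- ordered[2,]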
To see what that code does, here are the first 6 rows of the data.frame before reordering:
Factor_1 Factor_2 Number
1 apple apple 1
2 banana apple 1
3 apple kiwi 6
4 apple pear 1
5 watermelon apple 8
6 banana banana 3
And here they are afterwards:
Factor_1 Factor_2 Number
1 apple apple 1
2 apple banana 1
3 apple kiwi 6
4 apple pear 1
5 apple watermelon 8
6 banana banana 3
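For anyone who wants to try the reordering on its own, here is a self-contained sketch. It rebuilds the six rows shown above as a stand-in data frame, applies the row-wise sort, and compares it with a vectorised variant based on pmin()/pmax(). The pmin()/pmax() variant and the names sorted_labels, first and second are assumptions of this sketch, not part of the answer.

```r
# Stand-in for the first six rows of the question's data (character columns)
d <- data.frame(
  Factor_1 = c("apple", "banana", "apple", "apple", "watermelon", "banana"),
  Factor_2 = c("apple", "apple", "kiwi", "pear", "apple", "banana"),
  Number   = c(1, 1, 6, 1, 8, 3),
  stringsAsFactors = FALSE
)

# Row-wise sort as in the answer: apply() returns a 2 x n character matrix
sorted_labels <- apply(d[c("Factor_1", "Factor_2")], 1, sort)

# Vectorised alternative (assumption of this sketch): pmin()/pmax() pick the
# alphabetically smaller and larger label of each pair
first  <- pmin(d$Factor_1, d$Factor_2)
second <- pmax(d$Factor_1, d$Factor_2)

d$Factor_1 <- first
d$Factor_2 <- second
print(d)
```

Both routes should give the same label order for data like this; the vectorised one avoids the per-row apply() call, which may matter on larger files.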
Answer 1 (score: 3)
I called the data 'fruits' (with the default column names V1, V2 and V3):
fruit.tbl <- xtabs(V3 ~ V1+V2, data=fruits)
> melt(fruit.tbl)
V1 V2 value
1 apple apple 1
2 banana apple 0
3 kiwi apple 0
4 pear apple 0
5 watermelon apple 0
6 apple banana 1
7 banana banana 3
8 kiwi banana 0
9 pear banana 0
10 watermelon banana 0
11 apple kiwi 6
12 banana kiwi 2
13 kiwi kiwi 9
14 pear kiwi 0
15 watermelon kiwi 0
16 apple pear 1
17 banana pear 1
18 kiwi pear 4
19 pear pear 3
20 watermelon pear 0
21 apple watermelon 8
22 banana watermelon 9
23 kiwi watermelon 4
24 pear watermelon 9
25 watermelon watermelon 1
mfruit <- melt(fruit.tbl)
# Mark zero cells as NA so they get the NA colour instead of the low end of the gradient
is.na(mfruit$value) <- mfruit$value == 0
# Needed to swap x and y to get it the way you wanted
(p <- ggplot(mfruit, aes(V2, V1, fill = value)) +
    geom_tile(colour = "black") +
    scale_fill_gradient(low = "black", high = "green")
)
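The data-preparation part of this approach can be sketched without any add-on packages. The example below uses a small hypothetical subset of the data (three fruits, label pairs already in sorted order) and swaps in base R's as.data.frame() for melt(), so it is an approximation of the steps above rather than the exact code.

```r
# Hypothetical subset of the data: sorted label pairs, default column names
fruits <- data.frame(
  V1 = c("apple", "apple", "apple", "banana", "banana", "kiwi"),
  V2 = c("apple", "banana", "kiwi", "banana", "kiwi", "kiwi"),
  V3 = c(1, 1, 6, 3, 2, 9)
)

# Cross-tabulate: every V1 x V2 combination gets a cell, absent pairs become 0
fruit.tbl <- xtabs(V3 ~ V1 + V2, data = fruits)

# Base-R counterpart of melt(): one row per cell, the value lands in "Freq"
mfruit <- as.data.frame(fruit.tbl, stringsAsFactors = FALSE)
names(mfruit)[names(mfruit) == "Freq"] <- "value"

# Flag the zero cells as missing so they can be told apart from real values
is.na(mfruit$value) <- mfruit$value == 0
print(mfruit)
```

The resulting mfruit has the same V1, V2 and value columns as the melted table, so it should be usable in the same ggplot() call.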