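I have the table below, which I have already converted from wide to long format. I want to plot "logFPKM" for each "sample", grouped by "gene_id", using ggplot2 (geom_bar). I would also like to match the standard error ("se") to each sample's logFPKM value accordingly. Here is the head of my table: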
gene_id sample logFPKM se
PCBA_RS20130 CW 2.82138999505533 0.0510157917418624
PCBA_RS20130 CW 2.82138999505533 0.0614430466292
PCBA_RS20130 CW 2.82138999505533 0.15767922584651
PCBA_RS20130 W24 3.30091961220465 0.0510157917418624
PCBA_RS20130 W24 3.30091961220465 0.0614430466292
PCBA_RS20130 W24 3.30091961220465 0.15767922584651
PCBA_RS20130 W72 3.03503118006935 0.0510157917418624
PCBA_RS20130 W72 3.03503118006935 0.0614430466292
PCBA_RS20130 W72 3.03503118006935 0.15767922584651
PCBA_RS20135 CW 6.9229217846409 0.00450698521094983
PCBA_RS20135 CW 6.9229217846409 0.0224906710108503
PCBA_RS20135 CW 6.9229217846409 0.0917704536947984
PCBA_RS20135 W24 6.84058248620209 0.00450698521094983
PCBA_RS20135 W24 6.84058248620209 0.0224906710108503
PCBA_RS20135 W24 6.84058248620209 0.091770453694798
PCBA_RS20135 W72 5.95705243892052 0.00450698521094983
PCBA_RS20135 W72 5.95705243892052 0.0224906710108503
PCBA_RS20135 W72 5.95705243892052 0.0917704536947984
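The code below successfully plots the logFPKM (y) values for each class (CW, W24, W72) in the sample (x) column. However, the "se" values are drawn three times on each logFPKM bar, and I am struggling to make "logFPKM" and "se" go together with the sample classes. How can I associate the "se" values of each "sample" class (CW, W24, W72) correctly with each logFPKM?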
ggplot(both_long, aes(x=sample,y=logFPKM,fill=factor(gene_id), ymax=logFPKM+se, ymin=logFPKM-se)) +
geom_bar(position = "dodge", stat = "identity") +
geom_errorbar(position = "dodge")
Here is what the ggplot2 output looks like:
And here is the input:
dput(both_long)
structure(list(V1 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("PCBA_RS20130", "PCBA_RS20135"), class = "factor"), V2 = structure(c(1L, 1L,
1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L
), .Label = c("CW", "W24", "W72"), class = "factor"), V3 = c(2.82138999505533,
2.82138999505533, 2.82138999505533, 3.30091961220465, 3.30091961220465, 3.30091961220465,
3.03503118006935, 3.03503118006935, 3.03503118006935,
6.9229217846409, 6.9229217846409, 6.9229217846409, 6.84058248620209, 6.84058248620209,
6.84058248620209, 5.95705243892052, 5.95705243892052, 5.95705243892052
), V4 = c(0.0510157917418624, 0.0614430466292,
0.15767922584651, 0.0510157917418624, 0.0614430466292, 0.15767922584651,
0.0510157917418624, 0.0614430466292, 0.15767922584651, 0.00450698521094983,
0.0224906710108503, 0.0917704536947984, 0.00450698521094983,
0.0224906710108503, 0.0917704536947984, 0.00450698521094983,
0.0224906710108503, 0.0917704536947984)), .Names = c("V1", "V2", "V3", "V4"), class = "data.frame", row.names = c(NA, -18L))
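Thanks, everyone. Cheers.
Answer 0 (score: 1)
As already noted in the comments, your data contain three identical logFPKM values for each sample and gene, each with a different se. You can therefore try plotting each value independently using an interaction, for example: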
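library(tidyverse)
both_long %>%
  group_by(gene_id, sample) %>%
  # number the rows within each gene/sample group and combine that index with the sample
  mutate(sample2 = interaction(1:n(), sample)) %>%
  ggplot(aes(x = sample2, y = logFPKM, fill = factor(gene_id), ymax = logFPKM + se, ymin = logFPKM - se)) +
  # one bar per row, dodged by gene, each with its own error bar
  geom_col(position = "dodge") +
  geom_errorbar(position = "dodge")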
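To see why the error bars stack up in the original plot, it may help to count the rows per gene and sample first. The following is only a sketch under some assumptions: the data frame is rebuilt from the table head shown in the question, with the column names gene_id, sample, logFPKM and se assigned directly (the dput output uses V1 to V4), and rep_id is a hypothetical helper column that is not part of the original data. The explicit position_dodge(width = 0.9) and the error bar width of 0.3 are my own choices, meant to keep the error bars centred on the dodged bars.

```r
library(ggplot2)

# Rebuild the data from the table in the question (assumed column names).
both_long <- data.frame(
  gene_id = rep(c("PCBA_RS20130", "PCBA_RS20135"), each = 9),
  sample  = rep(rep(c("CW", "W24", "W72"), each = 3), times = 2),
  logFPKM = rep(c(2.82138999505533, 3.30091961220465, 3.03503118006935,
                  6.9229217846409, 6.84058248620209, 5.95705243892052),
                each = 3),
  se      = c(rep(c(0.0510157917418624, 0.0614430466292, 0.15767922584651),
                  times = 3),
              rep(c(0.00450698521094983, 0.0224906710108503, 0.0917704536947984),
                  times = 3))
)

# Rows per gene/sample combination, and distinct logFPKM values among them.
n_rows <- aggregate(se ~ gene_id + sample, data = both_long, FUN = length)
n_vals <- aggregate(logFPKM ~ gene_id + sample, data = both_long,
                    FUN = function(v) length(unique(v)))

# rep_id (hypothetical helper): row index within each gene/sample group.
both_long$rep_id <- ave(seq_len(nrow(both_long)),
                        both_long$gene_id, both_long$sample,
                        FUN = seq_along)

# One x position per (rep_id, sample) pair, as in the answer above.
both_long$sample2 <- interaction(both_long$rep_id, both_long$sample)

# Same dodge width for bars and error bars so they line up.
p <- ggplot(both_long,
            aes(x = sample2, y = logFPKM, fill = gene_id,
                ymin = logFPKM - se, ymax = logFPKM + se)) +
  geom_col(position = position_dodge(width = 0.9)) +
  geom_errorbar(position = position_dodge(width = 0.9), width = 0.3)

print(p)
```

Inspecting n_rows and n_vals shows how many se values share a single logFPKM value in each gene/sample combination, which is what causes several error bars to be drawn on one bar when x is just sample.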