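I have some data from which I am creating a stacked bar chart with ggplot2, with one bar per sample on the x-axis. I also have three factors for each sample. Between the bars and the sample names, I would like to draw a differently coloured square for each sample according to each factor. Since every sample has three factors, the squares would form three rows, a bit like a waffle chart. The data below show the three factors I have for each sample: "tissue_type", "biopsy_type" and "gleason_score". Is there a way to plot all of this together, the stacked bar chart plus a kind of waffle chart?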
Data:
The `total` data frame I currently use to make the stacked bar chart with ggplot:
> total
aberration_type Freq sample_name tissue_type biopsy_type
1 homozygous_loss 42 160078-N_S16_L001_R1_001 Normal Normal
2 heterozygous_loss 200 160078-N_S16_L001_R1_001 Normal Normal
3 homozygous_loss 56 160078-T_S17_L001_R1_001 Tumour Repeat
4 heterozygous_loss 1917 160078-T_S17_L001_R1_001 Tumour Repeat
5 gain 666 160078-T_S17_L001_R1_001 Tumour Repeat
6 homozygous_loss 42 160079-N_S7_L001_R1_001 Normal Normal
7 heterozygous_loss 78 160079-N_S7_L001_R1_001 Normal Normal
8 homozygous_loss 193 160079-T_S8_L001_R1_001 Tumour Repeat
9 heterozygous_loss 4336 160079-T_S8_L001_R1_001 Tumour Repeat
10 gain 129 160079-T_S8_L001_R1_001 Tumour Repeat
11 homozygous_loss 42 160080-N_S20_L001_R1_001 Normal Normal
12 heterozygous_loss 78 160080-N_S20_L001_R1_001 Normal Normal
13 homozygous_loss 42 160081-N_S21_L001_R1_001 Normal Normal
14 heterozygous_loss 76 160081-N_S21_L001_R1_001 Normal Normal
15 homozygous_loss 42 160081-T_S22_L001_R1_001 Tumour Repeat
16 heterozygous_loss 1191 160081-T_S22_L001_R1_001 Tumour Repeat
17 gain 59 160081-T_S22_L001_R1_001 Tumour Repeat
18 homozygous_loss 42 160082-N_S23_L001_R1_001 Normal Normal
19 heterozygous_loss 6 160082-N_S23_L001_R1_001 Normal Normal
20 homozygous_loss 42 160083-N_S24_L001_R1_001 Normal Normal
21 heterozygous_loss 6 160083-N_S24_L001_R1_001 Normal Normal
22 homozygous_loss 42 160083-T_S25_L001_R1_001 Tumour Repeat
23 heterozygous_loss 515 160083-T_S25_L001_R1_001 Tumour Repeat
24 gain 88 160083-T_S25_L001_R1_001 Tumour Repeat
25 homozygous_loss 42 160084-N_S26_L001_R1_001 Normal Normal
26 heterozygous_loss 79 160084-N_S26_L001_R1_001 Normal Normal
27 homozygous_loss 42 160084-T_S27_L001_R1_001 Tumour Initial
28 heterozygous_loss 671 160084-T_S27_L001_R1_001 Tumour Initial
29 gain 56 160084-T_S27_L001_R1_001 Tumour Initial
30 homozygous_loss 42 160088-N_S5_L001_R1_001 Normal Normal
31 heterozygous_loss 63 160088-N_S5_L001_R1_001 Normal Normal
32 homozygous_loss 42 160088-T_S6_L001_R1_001 Tumour Initial
33 heterozygous_loss 6 160088-T_S6_L001_R1_001 Tumour Initial
34 homozygous_loss 42 160089-N_S28_L001_R1_001 Normal Normal
35 heterozygous_loss 114 160089-N_S28_L001_R1_001 Normal Normal
36 homozygous_loss 113 160089-T_S29_L001_R1_001 Tumour Repeat
37 heterozygous_loss 4196 160089-T_S29_L001_R1_001 Tumour Repeat
38 gain 8 160089-T_S29_L001_R1_001 Tumour Repeat
39 homozygous_loss 42 160090-N_S13_L001_R1_001 Normal Normal
40 heterozygous_loss 75 160090-N_S13_L001_R1_001 Normal Normal
41 homozygous_loss 42 160091-N_S14_L001_R1_001 Normal Normal
42 heterozygous_loss 74 160091-N_S14_L001_R1_001 Normal Normal
43 homozygous_loss 42 160091-T_S15_L001_R1_001 Tumour Repeat
44 heterozygous_loss 194 160091-T_S15_L001_R1_001 Tumour Repeat
45 homozygous_loss 41 160093-N_S9_L001_R1_001 Normal Normal
46 heterozygous_loss 6 160093-N_S9_L001_R1_001 Normal Normal
47 homozygous_loss 42 160093-T_S10_L001_R1_001 Tumour Initial
48 heterozygous_loss 1034 160093-T_S10_L001_R1_001 Tumour Initial
49 homozygous_loss 42 160094-N_S11_L001_R1_001 Normal Normal
50 heterozygous_loss 77 160094-N_S11_L001_R1_001 Normal Normal
51 homozygous_loss 42 160094-T_S12_L001_R1_001 Tumour Repeat
52 heterozygous_loss 2192 160094-T_S12_L001_R1_001 Tumour Repeat
53 gain 10 160094-T_S12_L001_R1_001 Tumour Repeat
54 homozygous_loss 42 160095-N_S1_L001_R1_001 Normal Normal
55 heterozygous_loss 76 160095-N_S1_L001_R1_001 Normal Normal
56 homozygous_loss 41 160095-T_S2_L001_R1_001 Tumour Initial
57 heterozygous_loss 442 160095-T_S2_L001_R1_001 Tumour Initial
58 homozygous_loss 42 160096-N_S4_L001_R1_001 Normal Normal
59 heterozygous_loss 6 160096-N_S4_L001_R1_001 Normal Normal
60 homozygous_loss 42 160096-T_S4_L001_R1_001 Tumour Repeat
61 heterozygous_loss 484 160096-T_S4_L001_R1_001 Tumour Repeat
62 homozygous_loss 42 160098-N_S4_L001_R1_001 Normal Normal
63 heterozygous_loss 68 160098-N_S4_L001_R1_001 Normal Normal
64 homozygous_loss 42 160098-T_S4_L001_R1_001 Tumour Initial
65 heterozygous_loss 598 160098-T_S4_L001_R1_001 Tumour Initial
gleason_score
1 Normal
2 Normal
3 3_4
4 3_4
5 3_4
6 Normal
7 Normal
8 3_3
9 3_3
10 3_3
11 Normal
12 Normal
13 Normal
14 Normal
15 3_3
16 3_3
17 3_3
18 Normal
19 Normal
20 Normal
21 Normal
22 3_3
23 3_3
24 3_3
25 Normal
26 Normal
27 3_3
28 3_3
29 3_3
30 Normal
31 Normal
32 3_3
33 3_3
34 Normal
35 Normal
36 3_4
37 3_4
38 3_4
39 Normal
40 Normal
41 Normal
42 Normal
43 3_3
44 3_3
45 Normal
46 Normal
47 3_3
48 3_3
49 Normal
50 Normal
51 3_4
52 3_4
53 3_4
54 Normal
55 Normal
56 3_3
57 3_3
58 Normal
59 Normal
60 3_4
61 3_4
62 Normal
63 Normal
64 3_3
65 3_3
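Answer 0 (score: 1)
I'm sure this is possible, but in general faceting is a good way to visualize data by different factors. Here is a first attempt. It makes the sample labels a bit crowded and the bars less clear (at least in this small version), but it does illustrate the main finding: heterozygous loss is higher in the tumour biopsies.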
library(ggplot2)

# Stacked bars per sample, ordered by decreasing mean Freq
ggplot(total, aes(x = reorder(sample_name, -Freq),
                  y = Freq,
                  fill = aberration_type)) +
  geom_col() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 5)) +
  labs(title = "Frequency aberrant bins",
       x = "Sample Name",
       y = "Frequency") +
  # One panel per combination of the three factors
  facet_grid(biopsy_type ~ tissue_type + gleason_score)
For a cleaner plot you can facet on fewer factors, e.g. just the biopsy type:
+ facet_grid(biopsy_type ~ .)
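To try the single-factor version without the full data set, here is a minimal self-contained sketch. The data frame `total_subset` is a hypothetical name, not from the question; it holds rows 1-5 and 27-29 of the printed `total`, copied by hand, and keeps only the columns the plot needs.

```r
library(ggplot2)

# Hand-copied subset of the question's `total` (rows 1-5 and 27-29);
# `total_subset` is an illustrative name, not part of the original post.
total_subset <- data.frame(
  aberration_type = c("homozygous_loss", "heterozygous_loss",
                      "homozygous_loss", "heterozygous_loss", "gain",
                      "homozygous_loss", "heterozygous_loss", "gain"),
  Freq = c(42, 200, 56, 1917, 666, 42, 671, 56),
  sample_name = c(rep("160078-N_S16_L001_R1_001", 2),
                  rep("160078-T_S17_L001_R1_001", 3),
                  rep("160084-T_S27_L001_R1_001", 3)),
  biopsy_type = c(rep("Normal", 2), rep("Repeat", 3), rep("Initial", 3))
)

# Same stacked bars as above, but faceted by biopsy type only (one row per level)
p <- ggplot(total_subset, aes(x = reorder(sample_name, -Freq),
                              y = Freq,
                              fill = aberration_type)) +
  geom_col() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 5)) +
  labs(title = "Frequency aberrant bins",
       x = "Sample Name",
       y = "Frequency") +
  facet_grid(biopsy_type ~ .)

print(p)
```

Because the facets are stacked in rows, all panels share one x-axis, so each sample keeps a slot in every panel and shows a bar only in the panel for its own biopsy type.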