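I want to use geom_dotplot to distinguish two different variables: one by the shape of the dots and the other, which has two categories, by colour. For example, I have this dataset: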
Position phaseGroup phaseGroup_2 phaseGroup_3 Synonymous Information Phasing Pha_Syn Grouped yPos
5.85E+04 1 1 1 16 1.1 Phased Phased-16 1 1
5.96E+04 1 1 1 16 1.1 Phased Phased-16 1 2
6.22E+04 1 1 1 16 1.1 Phased Phased-16 1 3
6.26E+04 1 1 1 16 1.1 Phased Phased-16 1 4
7.22E+04 NA 1 1 16 1.NA Unphased Unphased-16 1 5
7.30E+04 1 1 1 16 1.1 Phased Phased-16 1 6
2.03E+05 2 2 2.1 16 1.2 Phased Phased-16 1 7
2.48E+05 2 2 2.1 1 1.2 Phased Phased-1 1 8
2.53E+05 2 2 2.1 16 1.2 Phased Phased-16 1 9
2.53E+05 2 2 2.1 16 1.2 Phased Phased-16 1 10
2.54E+05 2 2 2.1 16 1.2 Phased Phased-16 1 11
2.54E+05 2 2 2.1 16 1.2 Phased Phased-16 1 12
2.54E+05 NA 2 2.2 16 1.NA Unphased Unphased-16 1 13
2.55E+05 2 2 2.2 16 1.2 Phased Phased-16 1 14
2.56E+05 2 2 2.2 16 1.2 Phased Phased-16 1 15
2.58E+05 2 2 2.2 16 1.2 Phased Phased-16 1 16
2.61E+05 2 2 2.2 16 1.2 Phased Phased-16 1 17
5.76E+05 3 3 3.1 16 1.3 Phased Phased-16 1 18
5.77E+05 3 3 3.1 16 1.3 Phased Phased-16 1 19
5.77E+05 3 3 3.1 16 1.3 Phased Phased-16 1 20
5.79E+05 3 3 3.1 16 1.3 Phased Phased-16 1 21
6.22E+05 3 3 3.1 16 1.3 Phased Phased-16 1 22
6.23E+05 3 3 3.1 1 1.3 Phased Phased-1 1 23
6.24E+05 3 3 3.2 16 1.3 Phased Phased-16 1 24
6.25E+05 3 3 3.2 16 1.3 Phased Phased-16 1 25
6.26E+05 3 3 3.2 16 1.3 Phased Phased-16 1 26
6.27E+05 3 3 3.2 16 1.3 Phased Phased-16 1 27
6.29E+05 3 3 3.2 16 1.3 Phased Phased-16 1 28
6.29E+05 3 3 3.2 16 1.3 Phased Phased-16 1 29
6.31E+05 3 3 3.3 16 1.3 Phased Phased-16 1 30
6.32E+05 3 3 3.3 16 1.3 Phased Phased-16 1 31
6.32E+05 3 3 3.3 16 1.3 Phased Phased-16 1 32
6.34E+05 3 3 3.3 16 1.3 Phased Phased-16 1 33
6.35E+05 3 3 3.3 16 1.3 Phased Phased-16 1 34
6.37E+05 3 3 3.3 16 1.3 Phased Phased-16 1 35
6.76E+05 3 3 3.4 16 1.3 Phased Phased-16 1 36
6.82E+05 3 3 3.4 16 1.3 Phased Phased-16 1 37
7.40E+05 3 3 3.4 16 1.3 Phased Phased-16 1 38
7.57E+05 3 3 3.4 16 1.3 Phased Phased-16 1 39
7.60E+05 3 3 3.4 16 1.3 Phased Phased-16 1 40
7.61E+05 3 3 3.4 16 1.3 Phased Phased-16 1 41
7.61E+05 3.5 3.5 3.5 16 2.1 Phased Phased-16 2 41.5
2.03E+06 4 4 4 16 3.4 Phased Phased-16 3 42
2.10E+06 4 4 4 1 3.4 Phased Phased-1 3 43
2.15E+06 4 4 4 16 3.4 Phased Phased-16 3 44
2.16E+06 4 4 4 16 3.4 Phased Phased-16 3 45
2.16E+06 4 4 4 16 3.4 Phased Phased-16 3 46
2.16E+06 4 4 4 16 3.4 Phased Phased-16 3 47
2.17E+06 4 4 4 1 3.4 Phased Phased-1 3 48
2.18E+06 NA 4 4 1 3.NA Unphased Unphased-1 3 49
2.36E+06 5 5 5 16 3.5 Phased Phased-16 3 50
2.36E+06 5 5 5 16 3.5 Phased Phased-16 3 51
2.37E+06 5 5 5 16 3.5 Phased Phased-16 3 52
2.37E+06 5 5 5 1 3.5 Phased Phased-1 3 53
2.37E+06 5 5 5 1 3.5 Phased Phased-1 3 54
2.37E+06 5 5 5 16 3.5 Phased Phased-16 3 55
2.37E+06 5 5 5 16 3.5 Phased Phased-16 3 56
2.37E+06 5 5 5 1 3.5 Phased Phased-1 3 57
2.50E+06 5 5 5 16 3.5 Phased Phased-16 3 58
2.50E+06 5 5 5 16 3.5 Phased Phased-16 3 59
2.53E+06 5 5 5 1 3.5 Phased Phased-1 3 60
2.54E+06 5 5 5 16 3.5 Phased Phased-16 3 61
2.54E+06 5 5 5 1 3.5 Phased Phased-1 3 62
2.56E+06 5 5 5 16 3.5 Phased Phased-16 3 63
2.60E+06 5 5 5 16 3.5 Phased Phased-16 3 64
2.62E+06 5 5 5 16 3.5 Phased Phased-16 3 65
3.04E+06 NA 5 5 1 4.NA Unphased Unphased-1 4 66
3.17E+06 NA 5 5 1 4.NA Unphased Unphased-1 4 67
3.84E+06 NA 5 5 16 4.NA Unphased Unphased-16 4 68
4.00E+06 6 6 6 16 5.6 Phased Phased-16 5 69
4.00E+06 6 6 6 16 5.6 Phased Phased-16 5 70
4.00E+06 6 6 6 16 5.6 Phased Phased-16 5 71
4.00E+06 6 6 6 1 5.6 Phased Phased-1 5 72
4.00E+06 6 6 6 16 5.6 Phased Phased-16 5 73
4.00E+06 NA 6 6 16 5.NA Unphased Unphased-16 5 74
4.00E+06 NA 6 6 16 5.NA Unphased Unphased-16 5 75
4.00E+06 7 7 7 16 5.7 Phased Phased-16 5 76
4.00E+06 7 7 7 16 5.7 Phased Phased-16 5 77
4.00E+06 7 7 7 1 5.7 Phased Phased-1 5 78
4.00E+06 7 7 7 1 5.7 Phased Phased-1 5 79
4.00E+06 7 7 7 16 5.7 Phased Phased-16 5 80
4.00E+06 7 7 7 16 5.7 Phased Phased-16 5 81
4.00E+06 8 8 8 16 5.8 Phased Phased-16 5 82
4.00E+06 8 8 8 16 5.8 Phased Phased-16 5 83
4.00E+06 8 8 8 16 5.8 Phased Phased-16 5 84
4.00E+06 8 8 8 16 5.8 Phased Phased-16 5 85
4.00E+06 8 8 8 16 5.8 Phased Phased-16 5 86
4.00E+06 8 8 8 16 5.8 Phased Phased-16 5 87
4.00E+06 8 8 8 16 5.8 Phased Phased-16 5 88
4.00E+06 8 8 8 16 5.8 Phased Phased-16 5 89
4.00E+06 8 8 8 16 5.8 Phased Phased-16 5 90
4.00E+06 8 8 8 16 5.8 Phased Phased-16 5 91
4.00E+06 8 8 8 16 5.8 Phased Phased-16 5 92
4.00E+06 8 8 8 16 5.8 Phased Phased-16 5 93
4.00E+06 8 8 8 16 5.8 Phased Phased-16 5 94
4.00E+06 8 8 8 16 5.8 Phased Phased-16 5 95
4.00E+06 8 8 8 16 5.8 Phased Phased-16 5 96
4.00E+06 8 8 8 16 5.8 Phased Phased-16 5 97
4.00E+06 8 8 8 16 5.8 Phased Phased-16 5 98
4.00E+06 8 8 8 16 5.8 Phased Phased-16 5 99
4.00E+06 8 8 8 16 5.8 Phased Phased-16 5 100
4.00E+06 8 8 8 16 5.8 Phased Phased-16 5 101
4.01E+06 8 8 8 16 5.8 Phased Phased-16 5 102
4.01E+06 8 8 8 16 5.8 Phased Phased-16 5 103
4.01E+06 9 9 9 16 5.9 Phased Phased-16 5 104
4.01E+06 9 9 9 16 5.9 Phased Phased-16 5 105
4.01E+06 NA 9 9 16 5.NA Unphased Unphased-16 5 106
4.01E+06 9 9 9 16 5.9 Phased Phased-16 5 107
4.01E+06 9 9 9 16 5.9 Phased Phased-16 5 108
4.01E+06 9 9 9 16 5.9 Phased Phased-16 5 109
4.01E+06 9 9 9 16 5.9 Phased Phased-16 5 110
4.01E+06 9 9 9 16 5.9 Phased Phased-16 5 111
4.01E+06 9 9 9 16 5.9 Phased Phased-16 5 112
4.01E+06 9 9 9 16 5.9 Phased Phased-16 5 113
4.01E+06 9 9 9 16 5.9 Phased Phased-16 5 114
4.01E+06 9 9 9 16 5.9 Phased Phased-16 5 115
4.01E+06 9 9 9 16 5.9 Phased Phased-16 5 116
4.01E+06 9 9 9 16 5.9 Phased Phased-16 5 117
4.01E+06 9 9 9 16 5.9 Phased Phased-16 5 118
4.01E+06 NA 9 9 1 5.NA Unphased Unphased-1 5 119
4.02E+06 9 9 9 16 5.9 Phased Phased-16 5 120
4.02E+06 10 10 10 1 5.1 Phased Phased-1 5 121
4.02E+06 10 10 10 16 5.1 Phased Phased-16 5 122
4.02E+06 NA 10 10 1 5.NA Unphased Unphased-1 5 123
4.02E+06 10 10 10 16 5.1 Phased Phased-16 5 124
4.02E+06 10 10 10 16 5.1 Phased Phased-16 5 125
4.02E+06 10 10 10 1 5.1 Phased Phased-1 5 126
4.02E+06 10 10 10 1 5.1 Phased Phased-1 5 127
4.02E+06 10 10 10 1 5.1 Phased Phased-1 5 128
4.02E+06 NA 10 10 1 5.NA Unphased Unphased-1 5 129
4.02E+06 10 10 10 1 5.1 Phased Phased-1 5 130
4.02E+06 10 10 10 1 5.1 Phased Phased-1 5 131
4.02E+06 10 10 10 16 5.1 Phased Phased-16 5 132
4.03E+06 10 10 10 16 5.1 Phased Phased-16 5 133
4.03E+06 10 10 10 16 5.1 Phased Phased-16 5 134
4.03E+06 10 10 10 16 5.1 Phased Phased-16 5 135
4.03E+06 10 10 10 16 5.1 Phased Phased-16 5 136
4.03E+06 10 10 10 16 5.1 Phased Phased-16 5 137
4.03E+06 10 10 10 16 5.1 Phased Phased-16 5 138
4.03E+06 10 10 10 16 5.1 Phased Phased-16 5 139
4.03E+06 10 10 10 16 5.1 Phased Phased-16 5 140
4.03E+06 10 10 10 16 5.1 Phased Phased-16 5 141
4.03E+06 10 10 10 16 5.1 Phased Phased-16 5 142
4.03E+06 10 10 10 16 5.1 Phased Phased-16 5 143
4.03E+06 10 10 10 16 5.1 Phased Phased-16 5 144
4.03E+06 10 10 10 16 5.1 Phased Phased-16 5 145
4.03E+06 10 10 10 16 5.1 Phased Phased-16 5 146
4.03E+06 10 10 10 16 5.1 Phased Phased-16 5 147
4.03E+06 10 10 10 16 5.1 Phased Phased-16 5 148
4.03E+06 10 10 10 16 5.1 Phased Phased-16 5 149
4.03E+06 10 10 10 16 5.1 Phased Phased-16 5 150
4.03E+06 10 10 10 16 5.1 Phased Phased-16 5 151
4.03E+06 NA 10 10 16 5.NA Phased Phased-16 5 152
4.03E+06 10 10 10 16 5.1 Phased Phased-16 5 153
4.03E+06 10 10 10 16 5.1 Phased Phased-16 5 154
4.03E+06 10 10 10 16 5.1 Phased Phased-16 5 155
4.03E+06 10 10 10 16 5.1 Phased Phased-16 5 156
4.04E+06 10 10 10 16 5.1 Phased Phased-16 5 157
4.04E+06 10 10 10 1 5.1 Phased Phased-1 5 158
4.04E+06 10 10 10 1 5.1 Phased Phased-1 5 159
4.04E+06 10 10 10 1 5.1 Phased Phased-1 5 160
4.04E+06 10 10 10 16 5.1 Phased Phased-16 5 161
4.04E+06 10 10 10 16 5.1 Phased Phased-16 5 162
4.04E+06 NA 10 10 16 5.NA Unphased Unphased-16 5 163
4.04E+06 NA 10 10 16 5.NA Unphased Unphased-16 5 164
4.04E+06 10 10 10 16 5.1 Phased Phased-16 5 165
4.04E+06 NA 10 10 16 5.NA Unphased Unphased-16 5 166
4.04E+06 10 10 10 16 5.1 Phased Phased-16 5 167
4.04E+06 10 10 10 1 5.1 Phased Phased-1 5 168
4.04E+06 10 10 10 1 5.1 Phased Phased-1 5 169
4.04E+06 10 10 10 1 5.1 Phased Phased-1 5 170
4.04E+06 10 10 10 16 5.1 Phased Phased-16 5 171
4.04E+06 10 10 10 1 5.1 Phased Phased-1 5 172
4.04E+06 NA 10 10 16 5.NA Unphased Unphased-16 5 173
4.04E+06 10 10 10 16 5.1 Phased Phased-16 5 174
4.04E+06 10 10 10 16 5.1 Phased Phased-16 5 175
4.04E+06 NA 10 10 16 5.NA Unphased Unphased-16 5 176
4.04E+06 10 10 10 16 5.1 Phased Phased-16 5 177
4.04E+06 10 10 10 1 5.1 Phased Phased-1 5 178
4.04E+06 10 10 10 1 5.1 Phased Phased-1 5 179
4.04E+06 10 10 10 16 5.1 Phased Phased-16 5 180
4.04E+06 10 10 10 16 5.1 Phased Phased-16 5 181
4.04E+06 10 10 10 1 5.1 Phased Phased-1 5 182
4.04E+06 10 10 10 16 5.1 Phased Phased-16 5 183
4.04E+06 10 10 10 16 5.1 Phased Phased-16 5 184
4.04E+06 10 10 10 16 5.1 Phased Phased-16 5 185
4.04E+06 NA 10 10 16 5.NA Unphased Unphased-16 5 186
4.04E+06 10 10 10 16 5.1 Phased Phased-16 5 187
4.05E+06 10 10 10 16 5.1 Phased Phased-16 5 188
4.05E+06 10 10 10 16 5.1 Phased Phased-16 5 189
4.05E+06 10 10 10 16 5.1 Phased Phased-16 5 190
4.05E+06 10 10 10 1 5.1 Phased Phased-1 5 191
4.05E+06 10 10 10 16 5.1 Phased Phased-16 5 192
4.05E+06 10 10 10 16 5.1 Phased Phased-16 5 193
4.05E+06 10 10 10 16 5.1 Phased Phased-16 5 194
4.05E+06 10 10 10 16 5.1 Phased Phased-16 5 195
4.05E+06 10 10 10 1 5.1 Phased Phased-1 5 196
4.05E+06 10 10 10 1 5.1 Phased Phased-1 5 197
4.05E+06 10 10 10 1 5.1 Phased Phased-1 5 198
4.05E+06 10 10 10 1 5.1 Phased Phased-1 5 199
4.05E+06 10 10 10 1 5.1 Phased Phased-1 5 200
4.05E+06 10 10 10 1 5.1 Phased Phased-1 5 201
4.05E+06 10 10 10 1 5.1 Phased Phased-1 5 202
4.05E+06 10 10 10 1 5.1 Phased Phased-1 5 203
4.05E+06 10 10 10 1 5.1 Phased Phased-1 5 204
4.05E+06 10 10 10 16 5.1 Phased Phased-16 5 205
4.05E+06 10 10 10 1 5.1 Phased Phased-1 5 206
4.05E+06 10 10 10 1 5.1 Phased Phased-1 5 207
4.05E+06 10 10 10 1 5.1 Phased Phased-1 5 208
4.59E+06 NA 10 10 16 5.NA Unphased Unphased-16 5 209
I run the following commands:
library(ggplot2)
ggplot(A, aes(x = factor(Grouped), y = phaseGroup_2, fill = factor(Phasing))) +
geom_dotplot(binaxis="y",stackdir = "center", dotsize = 0.3) +
theme_minimal()
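and they produce the attached plot.
However, I want to do two things with this plot. First, as you can see for group 5 at y = 10, I cannot see all the dots, but if I make the dots smaller I can no longer see the colours. Second, I want to change the shape of the dots according to the Synonymous column (1 = open circle, 16 = filled circle).
My question is: how can I change the shape of the dots and optimize the plot so that all dots are visible? An important note: I want to keep the number of horizontal levels the same, meaning that, as you can see, group 1 has 3 horizontal levels of data points while group 2 has only one.
Is there any way to optimize this figure, or do I need to try another command in ggplot?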
Answer 0 (score: 1)
Could you do something like this?
library(tidyverse)
df <-
A %>%
mutate(y = ifelse(Grouped == 4, phaseGroup_2 + 0.2, phaseGroup_2))
ggplot(df) +
geom_dotplot(
aes(x = Grouped, y = y, group = Grouped, fill = factor(Phasing)),
binaxis = "y",
stackdir = "center",
dotsize = 0.6
) +
scale_x_continuous(breaks = c(1:5), limits = c(-6, 16)) +
theme(
panel.grid.minor = element_blank(),
panel.grid.major.y = element_blank()
)
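I noticed that where phaseGroup_2 == 4 you have 2 groups sitting side by side, so I bumped that group up a little.
Also, in future please provide your dataset as code. The datapasta package has a nice add-in that pastes whatever is on your clipboard from Excel as a tribble or data.frame.
https://reprex.tidyverse.org/articles/articles/datapasta-reprex.html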
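As a minimal sketch of what "dataset as code" could look like without any add-in, the first seven rows of the question's table can be embedded as text and read with base R's read.table(). The object name A_head is hypothetical and chosen only for this illustration; the values are copied from the question.

```r
# First seven rows of the question's data, embedded as text (values copied from the post).
# A_head is a hypothetical name used only in this sketch.
A_head <- read.table(header = TRUE, text = "
Position phaseGroup phaseGroup_2 phaseGroup_3 Synonymous Information Phasing Pha_Syn Grouped yPos
5.85E+04 1 1 1 16 1.1 Phased Phased-16 1 1
5.96E+04 1 1 1 16 1.1 Phased Phased-16 1 2
6.22E+04 1 1 1 16 1.1 Phased Phased-16 1 3
6.26E+04 1 1 1 16 1.1 Phased Phased-16 1 4
7.22E+04 NA 1 1 16 1.NA Unphased Unphased-16 1 5
7.30E+04 1 1 1 16 1.1 Phased Phased-16 1 6
2.03E+05 2 2 2.1 16 1.2 Phased Phased-16 1 7
")

# Inspect the column types that read.table() inferred.
str(A_head)
```

Anyone can paste this into a fresh R session and get the same data frame, which is what makes a question reproducible.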
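Answer 1 (score: 1)
If I understand correctly, the OP is looking for a visualization of data with several categorical variables. The OP tried to use ggplot2::geom_dotplot() to visualize the counts of the four categorical variables phaseGroup_2, Grouped, Phasing, and Synonymous.
This answer explores different approaches: a workaround for geom_dotplot() using interaction(), the ggbeeswarm package, and the vcd and ggmosaic packages.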
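Workaround for geom_dotplot() with interaction()
As mentioned by aosmith, geom_dotplot() does not understand the shape aesthetic. Alternatively, with the small dot size the OP asks for, using the colour or alpha aesthetics does not give clearly distinguishable results.
One workaround is to use the colour and fill aesthetics for the combination (interaction) of the variables Phasing and Synonymous. Both variables are dichotomous, so only 4 different colours are needed.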
library(ggplot2)
ggplot(A) +
aes(x = factor(Grouped),
y = phaseGroup_2,
colour = interaction(Phasing, Synonymous),
fill = interaction(Phasing, Synonymous)) +
geom_dotplot(binaxis = "y",
stackdir = "center",
dotsize = 0.3) +
scale_y_continuous(breaks = unique(A$phaseGroup_2), minor_breaks = NULL) +
scale_colour_brewer(palette = "RdBu",
direction = -1,
aesthetics = c("colour", "fill")) +
expand_limits(x = 6) +
theme_minimal()
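geom_dotplot() uses filled circles, which are drawn with a black outline around each dot if only the fill aesthetic is used. To improve perception, colour and fill use the same encoding.
However, not all dots are fully visible, because some are hidden behind others due to overplotting. In particular, the blue dots for the Synonymous == 1 case are hardly visible.
By the way, the equivalent column Pha_Syn in the OP's dataset can be used directly instead of interaction(Phasing, Synonymous).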
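The equivalence between interaction(Phasing, Synonymous) and the Pha_Syn column can be checked with a small sketch. The toy data frame below is an assumption made for illustration; it contains one row for each of the four combinations that occur in the question's data.

```r
# Toy rows covering the four Phasing/Synonymous combinations found in the question's data.
toy <- data.frame(
  Phasing    = c("Phased", "Phased", "Unphased", "Unphased"),
  Synonymous = c(16, 1, 16, 1),
  Pha_Syn    = c("Phased-16", "Phased-1", "Unphased-16", "Unphased-1"),
  stringsAsFactors = FALSE
)

# interaction() builds one factor level per combination;
# sep = "-" mimics the labels used in the Pha_Syn column.
combo <- interaction(toy$Phasing, toy$Synonymous, sep = "-")

nlevels(combo)                               # number of distinct combinations
identical(as.character(combo), toy$Pha_Syn)  # TRUE when both encodings agree
```

Because both variables have two levels, the combined factor has four levels, which is why four colours are enough.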
Further improvements would include keeping the dots in Grouped == 5 from being cut off (as requested by the OP).
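ggbeeswarm
geom_beeswarm() understands the same aesthetics as geom_point(), including the shape aesthetic. In addition, it avoids overplotting.
Open and filled circles can be clearly distinguished.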
library(ggbeeswarm)
ggplot(A) +
aes(x = Grouped,
y = phaseGroup_2,
colour = Phasing,
shape = Synonymous) +
geom_beeswarm() +
scale_x_continuous(breaks = unique(A$Grouped), minor_breaks = NULL) +
scale_y_continuous(breaks = unique(A$phaseGroup_2), minor_breaks = NULL) +
scale_shape_identity() +
theme_minimal()
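vcd and ggmosaic
The vcd package contains many plots for visualizing categorical data. I tried mosaic(), doubledecker(), and tile() from the vcd package (as well as the ggmosaic package), but unfortunately none of them gave a satisfactory result because of the large number of empty cells.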
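To get a feeling for how sparse the data is, here is a rough sketch in base R that cross-tabulates only two of the four variables. The combos data frame is an assumption built for illustration: it lists each Grouped / phaseGroup_2 pair that occurs in the question's table once, so it shows which cells are populated but not the actual counts.

```r
# Distinct Grouped / phaseGroup_2 combinations that occur in the question's data.
combos <- data.frame(
  Grouped      = c(1, 1, 1, 2,   3, 3, 4, 5, 5, 5, 5, 5),
  phaseGroup_2 = c(1, 2, 3, 3.5, 4, 5, 5, 6, 7, 8, 9, 10)
)

# Cross-tabulate: every cell that stays at 0 is a combination
# for which a mosaic-type plot has nothing to draw.
tab <- table(combos$Grouped, combos$phaseGroup_2)
empty_cells <- sum(tab == 0)
total_cells <- length(tab)

c(empty = empty_cells, total = total_cells)
```

Adding Phasing and Synonymous as further dimensions multiplies the number of cells without adding observations, so the share of empty cells can only stay the same or grow.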
A function from these packages (vcd, ggmosaic) is able to read the OP's data without additional manual intervention.