I am creating a scatter plot with ggplot:
library(ggplot2)
mydata <- read.table('CF1_deNovoAssembly.csv', sep=",", header=TRUE)
ggplot(mydata, aes(log(Consensus.length), log(Average.coverage))) + geom_point()
The data in CF1_deNovoAssembly.csv:
Name Consensus length Total read count Single reads Reads in pairs Average coverage
CF1_seqReads contig 1 mapping 81148 77393 45653 31740 68.39
CF1_seqReads contig 2 mapping 5175 4154 2526 1628 57.33
CF1_seqReads contig 3 mapping 43676 43232 25550 17682 70.951
CF1_seqReads contig 4 mapping 33156 28321 16619 11702 61.458
CF1_seqReads contig 5 mapping 194560 158576 93416 65160 58.476
CF1_seqReads contig 6 mapping 26990 27221 16183 11038 72.267
CF1_seqReads contig 7 mapping 35155 34449 20227 14222 70.2
CF1_seqReads contig 8 mapping 110217 111889 65611 46278 73.075
CF1_seqReads contig 9 mapping 96757 87785 51431 36354 65.275
CF1_seqReads contig 10 mapping 169489 155776 91690 64086 65.993
CF1_seqReads contig 11 mapping 280769 215666 126964 88702 55.204
CF1_seqReads contig 12 mapping 29819 30563 17993 12570 73.624
CF1_seqReads contig 13 mapping 120801 116090 68428 47662 69.046
CF1_seqReads contig 14 mapping 172189 154880 91940 62940 64.499
CF1_seqReads contig 15 mapping 105798 88828 52338 36490 60.352
CF1_seqReads contig 16 mapping 212719 200557 117997 82560 67.748
CF1_seqReads contig 17 mapping 36352 29426 17354 12072 57.996
CF1_seqReads contig 18 mapping 1468 2594 1622 972 126.813
CF1_seqReads contig 19 mapping 123801 121038 71234 49804 70.139
CF1_seqReads contig 20 mapping 231369 226726 133732 92994 70.348
CF1_seqReads contig 21 mapping 125419 110004 64774 45230 62.915
CF1_seqReads contig 22 mapping 125818 113356 67034 46322 64.733
CF1_seqReads contig 23 mapping 53872 50388 29824 20564 67.235
CF1_seqReads contig 24 mapping 118273 99252 58798 40454 60.263
CF1_seqReads contig 25 mapping 5569 19834 11758 8076 257.753
CF1_seqReads contig 26 mapping 48830 47879 28265 19614 70.306
CF1_seqReads contig 27 mapping 33566 32370 19280 13090 69.097
CF1_seqReads contig 28 mapping 8357 6684 4046 2638 56.178
CF1_seqReads contig 29 mapping 82328 71998 42670 29328 62.916
CF1_seqReads contig 30 mapping 55288 52415 31023 21392 68.03
CF1_seqReads contig 31 mapping 49849 44216 26142 18074 63.699
CF1_seqReads contig 32 mapping 66991 69598 41202 28396 74.615
CF1_seqReads contig 33 mapping 210958 187922 110992 76930 63.938
CF1_seqReads contig 34 mapping 95028 86002 51080 34922 64.925
CF1_seqReads contig 35 mapping 25219 22685 13567 9118 65.146
CF1_seqReads contig 36 mapping 52506 44863 26493 18370 61.281
CF1_seqReads contig 37 mapping 44807 37939 22745 15194 60.863
CF1_seqReads contig 38 mapping 30091 25919 15355 10564 62.312
CF1_seqReads contig 39 mapping 49730 42295 25445 16850 60.872
CF1_seqReads contig 40 mapping 35166 27239 16101 11138 55.456
CF1_seqReads contig 41 mapping 58239 54831 32311 22520 67.764
CF1_seqReads contig 42 mapping 78398 69994 41578 28416 64.135
CF1_seqReads contig 43 mapping 79163 61667 36637 25030 55.958
CF1_seqReads contig 44 mapping 46179 37621 22479 15142 58.463
CF1_seqReads contig 45 mapping 1501 1209 715 494 55.69
CF1_seqReads contig 46 mapping 35505 36158 21296 14862 73.271
CF1_seqReads contig 47 mapping 108945 100876 59394 41482 66.479
CF1_seqReads contig 48 mapping 36042 30283 17961 12322 60.289
CF1_seqReads contig 49 mapping 125139 102821 60441 42380 59.021
CF1_seqReads contig 50 mapping 33093 31998 18976 13022 69.715
CF1_seqReads contig 51 mapping 19399 14764 8826 5938 54.607
CF1_seqReads contig 52 mapping 39627 30320 17856 12464 54.848
CF1_seqReads contig 53 mapping 12163 9861 5887 3974 58.008
CF1_seqReads contig 54 mapping 4378 3872 2442 1430 62.841
CF1_seqReads contig 55 mapping 107763 96191 56993 39198 64.165
CF1_seqReads contig 56 mapping 167629 143032 84032 59000 61.441
CF1_seqReads contig 57 mapping 97622 80176 47622 32554 58.829
CF1_seqReads contig 58 mapping 56912 56028 32850 23178 70.506
CF1_seqReads contig 59 mapping 15390 16360 9792 6568 76.745
CF1_seqReads contig 60 mapping 80202 71909 42337 29572 64.292
CF1_seqReads contig 61 mapping 45435 39732 23290 16442 62.592
CF1_seqReads contig 62 mapping 17972 15752 9208 6544 63.102
CF1_seqReads contig 63 mapping 41256 40603 23859 16744 70.545
CF1_seqReads contig 64 mapping 110461 93608 54796 38812 60.845
CF1_seqReads contig 65 mapping 62066 53798 31662 22136 62.125
CF1_seqReads contig 66 mapping 1981 1788 1112 676 63.459
CF1_seqReads contig 67 mapping 32249 28939 17121 11818 64.486
CF1_seqReads contig 68 mapping 30129 30299 17873 12426 72.002
CF1_seqReads contig 69 mapping 73494 70081 41307 28774 68.502
CF1_seqReads contig 70 mapping 42147 32350 19106 13244 54.965
CF1_seqReads contig 71 mapping 15109 14803 8827 5976 70.037
CF1_seqReads contig 72 mapping 19446 17197 10277 6920 63.506
CF1_seqReads contig 73 mapping 1203 2160 1410 750 127.011
CF1_seqReads contig 74 mapping 35575 31557 18907 12650 63.833
CF1_seqReads contig 75 mapping 61658 52593 31031 21562 61.218
CF1_seqReads contig 76 mapping 2104 2063 1335 728 69.914
CF1_seqReads contig 77 mapping 58182 49734 29348 20386 61.311
CF1_seqReads contig 78 mapping 55182 54095 32319 21776 70.398
CF1_seqReads contig 79 mapping 35523 34002 19964 14038 68.577
CF1_seqReads contig 80 mapping 5174 8766 5222 3544 119.842
CF1_seqReads contig 81 mapping 69777 59263 35069 24194 60.855
CF1_seqReads contig 82 mapping 23575 21660 12872 8788 65.608
CF1_seqReads contig 83 mapping 3065 2609 1597 1012 61.1
CF1_seqReads contig 84 mapping 332 803 619 184 171.226
CF1_seqReads contig 85 mapping 5538 5060 3028 2032 63.651
CF1_seqReads contig 86 mapping 18727 16636 9814 6822 63.747
CF1_seqReads contig 87 mapping 27818 21227 12585 8642 54.79
CF1_seqReads contig 88 mapping 20439 17310 10266 7044 60.577
CF1_seqReads contig 89 mapping 14937 13026 7656 5370 62.693
CF1_seqReads contig 90 mapping 17570 16529 9787 6742 67.656
CF1_seqReads contig 91 mapping 7927 7372 4374 2998 66.942
CF1_seqReads contig 92 mapping 2695 5155 3143 2012 136
CF1_seqReads contig 93 mapping 28431 22662 13382 9280 57.128
CF1_seqReads contig 94 mapping 10910 8378 5032 3346 54.889
CF1_seqReads contig 95 mapping 11426 11337 6863 4474 70.898
CF1_seqReads contig 96 mapping 39433 36586 21812 14774 66.563
CF1_seqReads contig 97 mapping 65815 66239 39289 26950 72.083
CF1_seqReads contig 98 mapping 11296 11627 6991 4636 73.84
CF1_seqReads contig 99 mapping 27785 22040 13130 8910 56.893
CF1_seqReads contig 100 mapping 26131 20073 11793 8280 55.234
CF1_seqReads contig 101 mapping 825 766 560 206 61.246
CF1_seqReads contig 102 mapping 25869 25524 15286 10238 70.695
CF1_seqReads contig 103 mapping 7747 7244 4356 2888 66.154
CF1_seqReads contig 104 mapping 34292 28755 16913 11842 60.05
CF1_seqReads contig 105 mapping 17219 16000 9346 6654 66.858
CF1_seqReads contig 106 mapping 39990 34798 20590 14208 62.384
CF1_seqReads contig 107 mapping 38227 33283 19721 13562 62.381
CF1_seqReads contig 108 mapping 1825 1439 919 520 54.89
CF1_seqReads contig 109 mapping 5333 4212 2494 1718 57.046
CF1_seqReads contig 110 mapping 13827 11248 6582 4666 58.276
CF1_seqReads contig 111 mapping 25486 22477 13277 9200 63.393
CF1_seqReads contig 112 mapping 15592 13751 8295 5456 63.048
CF1_seqReads contig 113 mapping 6230 4864 2986 1878 55.995
CF1_seqReads contig 114 mapping 28229 22164 13150 9014 56.051
CF1_seqReads contig 115 mapping 92951 92630 54674 37956 71.557
CF1_seqReads contig 116 mapping 24347 24204 14532 9672 71.386
CF1_seqReads contig 117 mapping 11556 11295 6657 4638 70.199
CF1_seqReads contig 118 mapping 2750 2553 1683 870 64.722
CF1_seqReads contig 119 mapping 19046 14586 8706 5880 54.681
CF1_seqReads contig 120 mapping 19966 17390 10290 7100 62.622
CF1_seqReads contig 121 mapping 1912 1657 1011 646 62.048
CF1_seqReads contig 122 mapping 1236 5497 3435 2062 318.75
CF1_seqReads contig 123 mapping 1136 852 584 268 53.619
CF1_seqReads contig 124 mapping 414 391 273 118 62.2
CF1_seqReads contig 125 mapping 912 931 619 312 72.031
CF1_seqReads contig 126 mapping 915 588 408 180 43.635
CF1_seqReads contig 127 mapping 2039 1853 1165 688 64.089
CF1_seqReads contig 128 mapping 1471 1253 837 416 58.997
CF1_seqReads contig 129 mapping 1148 2382 1560 822 147.665
CF1_seqReads contig 130 mapping 23233 23367 14443 8924 71.842
CF1_seqReads contig 131 mapping 702 472 324 148 47.107
CF1_seqReads contig 132 mapping 855 1461 967 494 120.706
CF1_seqReads contig 133 mapping 461 1027 725 302 157.434
CF1_seqReads contig 134 mapping 1136 834 580 254 52.482
CF1_seqReads contig 135 mapping 1222 1681 1131 550 98.43
CF1_seqReads contig 136 mapping 1316 997 689 308 53.191
CF1_seqReads contig 137 mapping 1923 1880 1204 676 68.222
CF1_seqReads contig 138 mapping 903 601 401 200 47.503
CF1_seqReads contig 139 mapping 604 495 367 128 56.925
CF1_seqReads contig 140 mapping 1854 1651 1081 570 62.929
CF1_seqReads contig 141 mapping 857 1666 1114 552 137.351
CF1_seqReads contig 142 mapping 273 264 214 50 65.048
CF1_seqReads contig 143 mapping 1848 1254 826 428 47.48
CF1_seqReads contig 144 mapping 9112 8829 5223 3606 69.287
CF1_seqReads contig 145 mapping 4959 8350 5042 3308 120.352
CF1_seqReads contig 146 mapping 1160 2386 1570 816 147.567
CF1_seqReads contig 147 mapping 3398 2919 1807 1112 59.74
CF1_seqReads contig 148 mapping 513 491 381 110 65.774
CF1_seqReads contig 149 mapping 2634 2644 1594 1050 71.279
CF1_seqReads contig 150 mapping 2333 1832 1086 746 54.456
CF1_seqReads contig 151 mapping 9929 8130 4910 3220 58.649
CF1_seqReads contig 152 mapping 4867 4591 2765 1826 66.831
CF1_seqReads contig 153 mapping 2244 1984 1278 706 61.906
CF1_seqReads contig 154 mapping 3008 2557 1581 976 61.333
CF1_seqReads contig 155 mapping 553 1015 733 282 130.448
CF1_seqReads contig 156 mapping 735 974 662 312 91.188
CF1_seqReads contig 157 mapping 1375 2157 1507 650 110.765
CF1_seqReads contig 158 mapping 211 168 160 8 54.796
CF1_seqReads contig 159 mapping 211 174 160 14 56.749
CF1_seqReads contig 160 mapping 3076 3113 1855 1258 73.188
CF1_seqReads contig 161 mapping 1965 1474 998 476 51.869
CF1_seqReads contig 162 mapping 2495 2055 1301 754 57.74
CF1_seqReads contig 163 mapping 230 201 183 18 59.178
CF1_seqReads contig 164 mapping 899 1786 1176 610 140.673
CF1_seqReads contig 165 mapping 3860 2683 1643 1040 49.358
CF1_seqReads contig 166 mapping 1207 1064 642 422 62.839
CF1_seqReads contig 167 mapping 6068 5769 3555 2214 67.996
CF1_seqReads contig 168 mapping 1345 980 628 352 51.059
CF1_seqReads contig 169 mapping 2407 2119 1233 886 62.073
CF1_seqReads contig 170 mapping 236 409 359 50 119.915
CF1_seqReads contig 171 mapping 2288 1959 1229 730 61.018
CF1_seqReads contig 172 mapping 1214 715 497 218 40.74
CF1_seqReads contig 173 mapping 323 531 431 100 113.607
CF1_seqReads contig 174 mapping 1222 789 529 260 44.583
CF1_seqReads contig 175 mapping 207 188 182 6 61.063
CF1_seqReads contig 176 mapping 2236 2204 1392 812 70.699
CF1_seqReads contig 177 mapping 1173 1189 901 288 70.116
CF1_seqReads contig 178 mapping 757 692 476 216 62.54
CF1_seqReads contig 179 mapping 238 485 413 72 137.378
CF1_seqReads contig 180 mapping 1122 984 670 314 62.156
CF1_seqReads contig 181 mapping 1717 1305 819 486 53.286
CF1_seqReads contig 182 mapping 739 1061 825 236 101.298
CF1_seqReads contig 183 mapping 377 293 231 62 54.255
CF1_seqReads contig 184 mapping 878 837 589 248 67.145
CF1_seqReads contig 185 mapping 905 786 540 246 60.841
CF1_seqReads contig 186 mapping 321 223 189 34 44.969
CF1_seqReads contig 187 mapping 215 251 221 30 77.498
CF1_seqReads contig 188 mapping 1153 1074 718 356 64.892
CF1_seqReads contig 189 mapping 568 441 303 138 53.771
CF1_seqReads contig 190 mapping 582 450 282 168 54.89
CF1_seqReads contig 191 mapping 452 767 585 182 119.653
CF1_seqReads contig 192 mapping 263 218 186 32 58.73
CF1_seqReads contig 193 mapping 313 247 193 54 54.22
CF1_seqReads contig 194 mapping 295 214 174 40 48.346
CF1_seqReads contig 195 mapping 297 197 145 52 47.007
CF1_seqReads contig 196 mapping 346 230 180 50 42.566
CF1_seqReads contig 197 mapping 392 226 180 46 37.457
CF1_seqReads contig 198 mapping 208 168 150 18 53.255
CF1_seqReads contig 199 mapping 660 586 398 188 62.903
CF1_seqReads contig 200 mapping 276 300 250 50 72.681
CF1_seqReads contig 201 mapping 388 269 231 38 45.611
CF1_seqReads contig 202 mapping 353 343 245 98 67.042
CF1_seqReads contig 203 mapping 284 175 139 36 42.144
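Looking at the y-axis, I can see three groups of points.
Is there an algorithm that can identify each group without using the max and/or min y values?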
Answer 0 (score: 4)
If you want to group y using some preset values, you can use cut.
A reproducible example:
set.seed(07122012)
DF <- data.frame(y= runif(100), x = rnorm(100))
# grouping at 0.33 / 0.66
mygroups <- seq(0, 1, length.out = 4)
ggplot(DF, aes(x=x,y=y)) + geom_point(aes(colour= cut(y,breaks = mygroups))) +
scale_colour_brewer('My groups', palette = 'Set2')
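
To make the binning step concrete, here is a minimal base-R sketch of what cut returns for the break points used above. The four y values are hypothetical and chosen only for illustration. Note that, with the default arguments, the intervals are open on the left, so a value exactly equal to the lowest break would become NA unless include.lowest = TRUE is passed.

```r
# Hypothetical y values, chosen only to illustrate the binning
y <- c(0.05, 0.40, 0.70, 0.95)

# Four equally spaced break points (0, 1/3, 2/3, 1) define three intervals
mygroups <- seq(0, 1, length.out = 4)

# cut() returns a factor with one level per interval
groups <- cut(y, breaks = mygroups)

# Count how many values fall into each interval
table(groups)
```

The resulting factor is what gets mapped to colour in the ggplot call above.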
Or you could do some simple clustering (perhaps a combination of scale and kmeans on x and y):
ggplot(DF, aes(x=x,y=y)) +
geom_point(aes(colour= factor(kmeans(scale(cbind(x,y)), centers=3)$cluster))) +
scale_colour_brewer('My groups', palette = 'Set2')
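
Since the question is about groups along the y-axis only, the same idea can be sketched on the log of the coverage column alone. This is an illustration under stated assumptions, not a definitive method: the twelve coverage values are copied from a few rows of the question's table (contigs 1 to 5, 18, 80, 92, 129, 84, 25 and 122), and centers = 3 simply encodes the asker's visual impression that there are three groups. No max or min thresholds on y are supplied by hand.

```r
# Average coverage values copied from a few rows of the question's table
coverage <- c(68.39, 57.33, 70.951, 61.458, 58.476,
              126.813, 119.842, 136, 147.665, 171.226,
              257.753, 318.75)

# k-means starts from random centres, so fix the seed and use several restarts
set.seed(1)
fit <- kmeans(log(coverage), centers = 3, nstart = 25)

# Cluster ids are arbitrary; relabel so that 1 = lowest and 3 = highest coverage
group <- match(fit$cluster, order(fit$centers))

# Show which coverage values ended up in each group
split(coverage, group)
```

On the full data, the relabelled vector could be turned into a factor and mapped to colour in the ggplot call, just as the cluster ids are above. Whether three is the right number of clusters is a judgement call that k-means does not make for you.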