Problem ordering a geom_segment plot

Date: 2017-07-26 09:37:06

Tags: r ggplot2

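I would appreciate some advice on my plot. I'm a ggplot newbie!

I'm trying to create a Cleveland dot plot faceted by cluster, which has 3 levels. There are 3 things I'm struggling with: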

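  1. Within each cluster, I want the points ordered by the continuous x variable (Cluster_prop). The code below does not order them correctly.

  2. Is it possible to change the point shape depending on whether the y variable ends in 0 (does not have the feature) or 1 (has the feature)?

  3. My dataset has a variable (Population) giving the overall percentage of the population with each feature. I want to see whether each feature is over- or under-represented in a cluster relative to the population, so I would like to add a point for it on the same row as each y variable.

Here is my code: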

    ggplot(cl1, aes(x=Cluster_prop, y=reorder(Var, Cluster_prop)))+
      geom_segment(aes(yend=Var), xend=0, colour="grey50")+
      geom_point(size=3, aes(colour=Cluster))+
      facet_grid(Cluster~., scales="free_y", space="free_y") +
      ggtitle("Top 10 Cluster Characteristics: % Children Within Cluster With Feature")
    

Here is my data:

    > dput(cl1)
    structure(list(Var = structure(c(2L, 3L, 5L, 7L, 14L, 16L, 18L, 
    19L, 20L, 22L, 15L, 9L, 7L, 6L, 21L, 13L, 17L, 12L, 4L, 11L, 
    15L, 17L, 21L, 1L, 13L, 4L, 10L, 12L, 6L, 8L), .Label = c("asthdoc_1", 
    "AttacksOnExer_1_0", "AttacksTTT_1_0", "AttacksTTT_1_1", "Breath0rmal_1_0", 
    "Breath0rmal_1_1", "CAsthmaMed_1_0", "CAsthmaMed_1_1", "CCurrentAsthma_1_0", 
    
    "CCurrentAsthma_1_1", "CongColds_1_1", "CoughNight_1_1", 
    "CoughWithColds_1_1", 
    "EverWheeze_1_0", "EverWheeze_1_1", "Wheeze6M_1_0", "Wheeze6M_1_1", 
    "WheezeMostDays_1_0", "WheezeOcc_1_0", "WheezeWithColds_1_0", 
    "WheezeWithColds_1_1", "WheezeWithShort_1_0"), class = "factor"), 
        Cluster_prop = c(100, 100, 100, 100, 100, 100, 100, 100, 
        100, 100, 100, 99.4219653, 98.8439306, 95.3757225, 94.7976879, 
        83.2369942, 79.1907514, 53.7572254, 50.867052, 50.867052, 
        100, 100, 100, 93.103448, 89.655172, 86.206897, 86.206897, 
        82.758621, 79.310345, 79.310345), Population = c(96.131528, 
        78.143133, 63.636364, 95.16441, 60.928433, 67.891683, 97.485493, 
        89.555126, 62.669246, 90.32882, 39.071567, 94.584139, 95.16441, 
        36.363636, 37.330754, 68.665377, 32.108317, 43.520309, 21.856867, 
        42.166344, 39.071567, 32.108317, 37.330754, 9.864603, 68.665377, 
        21.856867, 5.415861, 43.520309, 36.363636, 4.83559), Cluster = 
    structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("1", 
    "2", "3"), class = "factor")), .Names = c("Var", "Cluster_prop", 
    "Population", "Cluster"), row.names = c(NA, -30L), vars = "Cluster", drop = 
    TRUE, indices = list(
    0:9, 10:19, 20:29), group_sizes = c(10L, 10L, 10L), biggest_group_size = 
    10L, labels = structure(list(
    Cluster = 1:3), row.names = c(NA, -3L), class = "data.frame", vars = 
    "Cluster", drop = TRUE, .Names = "Cluster"), class = c("grouped_df", 
    "tbl_df", "tbl", "data.frame"))
    

Any advice is much appreciated!


1 Answer:

Answer 0 (score: 1)

For your second (EDIT: and third) question:

    library(tidyverse)
    library(stringr)

    # str_sub(string, start = -1, end = -1) extracts the last character of a string.
    # Here it picks up the trailing 0/1 of each variable name.
    cl2 <- cl1 %>% mutate(Shape = str_sub(Var, start = -1, end = -1))

    ggplot(cl2, aes(x=Cluster_prop, y=reorder(Var, Cluster_prop)))+
      geom_segment(aes(yend=Var), xend=0, colour="grey50")+
      # Shape of the cluster point depends on the trailing 0/1
      geom_point(size=3, aes(colour=Cluster, shape = Shape))+
      # Black point on the same row for the overall population percentage
      geom_point(aes(x = Population), size = 2, color = "black")+
      facet_grid(Cluster~., scales="free_y", space="free_y") +
      ggtitle("Top 10 Cluster Characteristics: % Children Within Cluster With Feature")
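The code above covers questions 2 and 3 but leaves the ordering from question 1 as it was. `reorder(Var, Cluster_prop)` computes a single ordering of `Var` over the whole data set (by default from the mean of `Cluster_prop` per level), so a variable that appears in more than one cluster gets one shared position rather than a position per facet. One possible workaround, which is not part of the original answer, is to build a key that is unique per variable and cluster, order its levels within each cluster, and strip the suffix again on the axis. The sketch below uses a small made-up data frame `df` in place of `cl1`; the names `df`, `Var_key` and the `___` separator are assumptions for illustration, and it assumes each variable appears at most once per cluster.

```r
library(ggplot2)

# Small made-up data frame standing in for cl1 (values are illustrative only).
df <- data.frame(
  Var          = c("a_0", "b_1", "c_0", "b_1", "c_0", "d_1"),
  Cluster      = factor(c(1, 1, 1, 2, 2, 2)),
  Cluster_prop = c(100, 80, 90, 95, 60, 70)
)

# Build a key that is unique per (Var, Cluster) pair, then order its levels
# by cluster and, within each cluster, by increasing Cluster_prop.
df$Var_key <- paste(df$Var, df$Cluster, sep = "___")
ord <- order(df$Cluster, df$Cluster_prop)
df$Var_key <- factor(df$Var_key, levels = df$Var_key[ord])

p <- ggplot(df, aes(x = Cluster_prop, y = Var_key)) +
  geom_segment(aes(yend = Var_key), xend = 0, colour = "grey50") +
  geom_point(size = 3, aes(colour = Cluster)) +
  # Strip the "___<cluster>" suffix so the axis shows the original names.
  scale_y_discrete(labels = function(x) sub("___.*$", "", x)) +
  facet_grid(Cluster ~ ., scales = "free_y", space = "free_y")

print(p)
```

With `scales = "free_y"`, each facet shows only its own keys, drawn bottom to top in level order, so the largest `Cluster_prop` should end up at the top of each panel. The same idea should carry over to `cl2` by replacing `df`, though that has not been run against the data posted in the question.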
