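Ordering bars in a geom_segment-based Gantt chart using ggplot with duplicated y-factors

Time: 2015-09-01 11:53:31

Tags: r ggplot2 dataframe visualization

Question

I have a dataset se.df (data at the bottom of the question) that I am visualising as a Gantt chart faceted by a factor, using ggplot and facet_grid. However, the y labels are not ordered the way I specify in aes:
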
library(ggplot2)
base <- ggplot(
  se.df,
  aes(
    x = Start.Date, reorder(Action,Start.Date), color = Comms.Type
  ))
base + geom_segment(aes(
  xend = End.Date,ystart = Action, yend = Action
), size = 5) + 
facet_grid(Source ~ .,scale = "free_y",space = "free_y", drop = TRUE)

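In this detail image you can see that there are bars that:

  1. Are not displayed in Start.Date order
  2. Are not ordered by Action. To clarify, bars should be ordered by Start.Date and then alphabetically by Action

[image]

How can I order the bars within each facet by Start.Date and then by Action?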

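Update

@heathobrien provided a solution that fixes the ordering of my bars by Start.Date, but not the problem caused by duplicated factor values, which my actual data has.

There are two instances of "Inform colleges" in Action, which causes the error in the following code from @heathobrien, highlighted in the image with a red dashed ellipse: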

    se.df <-se.df[order(se.df$Start.Date,se.df$Action),]
    se.df$Action <- factor(se.df$Action, levels=unique(se.df$Action))
    ggplot(se.df, aes(x = Start.Date, color = Comms.Type)) +
      geom_segment(aes(xend = End.Date, y = Action, yend = Action), size = 5) +
      facet_grid(Source ~ .,scale = "free_y",space = "free_y", drop = TRUE)
    

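[image]

How can I supply this data.frame to ggplot so that the ordering is consistent within each facet of facet_grid?

Further details

There are many questions about making Gantt charts and ordering factors. I have made a few decisions based on other people's answers:

  1. geom_segment

     Many askers have used geom_linerange, but then run into the fact that coord_flip cannot be used with non-cartesian coordinate systems. The solutions to this are complicated, and I have avoided these issues by using geom_segment.

  2. reorder in aes

     The almost canonical question on ordering bars uses reorder. However, this does not work with my data, even when using transform rather than specifying the order directly in aes. I would be happy with any solution that works.

Data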

        se.df <- structure(list(Source = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 
        2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
        1L, 1L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
        2L, 2L, 2L), .Label = c("a", "b", "c"), class = c("ordered", 
        "factor")), Action = structure(c(21L, 30L, 19L, 27L, 16L, 17L, 
        18L, 13L, 12L, 3L, 1L, 8L, 4L, 21L, 20L, 27L, 15L, 17L, 18L, 
        14L, 26L, 2L, 8L, 5L, 22L, 26L, 2L, 8L, 5L, 22L, 22L, 11L, 7L, 
        24L, 29L, 6L, 23L, 25L, 25L, 10L, 28L, 9L), .Label = c("Add OA \"Act on Acceptance\" to websites", 
        "Add RDM liaison presece to divisional and departmental websites", 
        "All-staff message from VC and/or Pro-VC (Research)", "Arrange OA Briefing for every department", 
        "Arrange RDM Briefing for every department", "Brief Communication Officers Network", 
        "Brief Conference of Colleges", "Brief divisional board/commitees", 
        "Brief Faculty IT Officers", "Brief Research Committee", "Brief Senior Tutors", 
        "Brief/mobilise internal comms officers", "Brief/mobilise ORFN", 
        "Brief/mobilise Subject Librarians", "Ceate template slides for colleagues to use in delivering RDM Briefings", 
        "Create template slides for colleagues to use in delivering OA Briefings", 
        "Create template text & icon for use on websites", "Draft material for use in staff induction", 
        "Ensure webpages for ORA & Symplectic Elements  are updated & consistent", 
        "Ensure webpages for ORA-Data are updated & consistent", "Finalise key messages and draft campaign text", 
        "Inform colleges ", "Inform Heads of Departments and Research Directors", 
        "Present at Departmental Administrator's Meeting", "Present at HAF meeting", 
        "Present at UAS Conference", "Produce hard copy materials to promote message ", 
        "Update Divisional Board", "Update Library Committee (CLIPS)", 
        "Update OAO website content for HEFCE/REF"), class = "factor"), 
            Start.Date = structure(c(1435705200, 1435705200, 1438383600, 
            1441062000, 1441062000, 1441062000, 1441062000, 1444518000, 
            1444518000, 1425168000, 1420070400, 1444518000, 1444518000, 
            1441062000, 1441062000, 1441062000, 1441062000, 1441062000, 
            1441062000, 1438383600, 1441062000, 1420070400, 1444518000, 
            1444518000, 1443654000, 1441062000, 1420070400, 1444518000, 
            1444518000, 1443654000, 1441062000, 1444518000, 1449273600, 
            1444518000, 1444518000, 1445036400, 1441062000, 1443740400, 
            1443740400, 1443740400, 1447459200, 1443740400), class = c("POSIXct", 
            "POSIXt"), tzone = ""), End.Date = structure(c(1440975600, 
            1440975600, 1443567600, 1443567600, 1443567600, 1443567600, 
            1443567600, 1449273600, 1449273600, 1430348400, 1446249600, 
            1449273600, 1449273600, 1446249600, 1446249600, 1443567600, 
            1443567600, 1443567600, 1443567600, 1443567600, 1443567600, 
            1443567600, 1449014400, 1449014400, 1451520000, 1443567600, 
            1443567600, 1449014400, 1449014400, 1451520000, 1443567600, 
            1449014400, 1449619200, 1449014400, 1449014400, 1446249600, 
            1446249600, 1449792000, 1449792000, 1449792000, 1447804800, 
            1449792000), class = c("POSIXct", "POSIXt"), tzone = ""), 
            Comms.Type = structure(c(3L, 7L, 7L, 6L, 5L, 7L, 8L, 4L, 
            4L, 2L, 7L, 1L, 1L, 3L, 7L, 6L, 5L, 7L, 8L, 4L, 5L, 7L, 1L, 
            1L, 5L, 5L, 7L, 1L, 1L, 5L, 1L, 1L, 1L, 5L, 3L, 1L, 1L, 5L, 
            5L, 1L, 1L, 1L), .Label = c("Briefing", "Email", "Mixed Media", 
            "Mobilisation", "Presentations", "Printed Materials", "Website", 
            "Workshop"), class = "factor")), .Names = c("Source", "Action", 
        "Start.Date", "End.Date", "Comms.Type"), row.names = c(NA, -42L
        ), class = c("tbl_df", "tbl", "data.frame"))
        

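3 Answers:

Answer 0: (score: 3)

I think the OP is looking for:

[image]

I had to create a synthetic taskID to carry the order (increasing Start.Date, then alphabetical by Action). By the way, if you want to sort alphabetically by Action, you need to either change the order of the factor levels or convert the column to character.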

# first let's order the DF the way we want it to appear 
#    (higher taskID's first)

# dplyr-free version
se.df$Action <- as.character(se.df$Action)  
se.df <- se.df[order(se.df$Start.Date, se.df$Action), ]
se.df$taskID <- as.factor(nrow(se.df):1)


library(ggplot2)
ggplot(se.df, aes(x = Start.Date, y=taskID, color = Comms.Type)) +
  scale_y_discrete(breaks=se.df$taskID, labels = se.df$Action) + 
  geom_segment(aes(xend = End.Date, y = taskID, yend = taskID), size = 5) +
  facet_grid(Source ~ .,scale = "free_y",space = "free_y", drop = TRUE)
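
To make the synthetic-ID idea easier to follow, here is a minimal base R sketch on invented toy data (the `toy` data frame and its values are hypothetical, not the asker's se.df). It only shows how one unique ID per row keeps duplicated labels apart and how the IDs map back to labels; a named lookup like `id_to_label` is one way the `labels` argument of `scale_y_discrete` could be fed, which I am assuming rather than demonstrating here.

```r
# Hypothetical toy data: the label "Inform" appears twice.
toy <- data.frame(
  Action = c("Inform", "Brief", "Inform", "Arrange"),
  Start  = as.Date(c("2015-09-01", "2015-10-01", "2015-10-01", "2015-09-01")),
  stringsAsFactors = FALSE
)

# Sort by start date, then alphabetically by action.
toy <- toy[order(toy$Start, toy$Action), ]

# One unique ID per row, reversed so that the earliest row gets the
# highest ID and ends up at the top of a discrete y axis.
toy$taskID <- factor(rev(seq_len(nrow(toy))))

# Lookup from ID to label; duplicated labels stay on separate IDs.
id_to_label <- setNames(toy$Action, as.character(toy$taskID))

# Labels listed from the top of the axis to the bottom.
top_to_bottom <- unname(id_to_label[rev(levels(toy$taskID))])
print(top_to_bottom)
```

Because every row has its own ID, both "Inform" rows keep their own position instead of collapsing onto a single factor level.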

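Answer 1: (score: 2)

Once you have sorted the data frame into the order you want, you should be able to use that order as the levels of the factor: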

se.df <-se.df[order(se.df$Start.Date,se.df$Action),]
se.df$Action <- factor(se.df$Action, levels=unique(se.df$Action))
ggplot(se.df, aes(x = Start.Date, color = Comms.Type)) +
  geom_segment(aes(xend = End.Date, y = Action, yend = Action), size = 5) +
  facet_grid(Source ~ .,scale = "free_y",space = "free_y", drop = TRUE)
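
A caveat with this approach, sketched below on invented toy data (the `toy` data frame is hypothetical): a factor has a single, global level order, so a label that occurs in more than one facet with different start dates cannot sit in the right place in every facet. This appears to be the situation the asker describes in the update.

```r
# Hypothetical toy data: "Inform" occurs in facets "a" and "b"
# with different start dates.
toy <- data.frame(
  Source = c("a", "b", "b"),
  Action = c("Inform", "Arrange", "Inform"),
  Start  = as.Date(c("2015-09-01", "2015-10-01", "2015-11-01")),
  stringsAsFactors = FALSE
)

# Same recipe as above: sort, then use order of first appearance as levels.
toy <- toy[order(toy$Start, toy$Action), ]
toy$Action <- factor(toy$Action, levels = unique(toy$Action))

# Look at facet "b" only.
b <- toy[toy$Source == "b", ]
by_level <- as.character(b$Action[order(as.integer(b$Action))])
by_date  <- as.character(b$Action[order(b$Start)])

print(levels(toy$Action))  # global level order
print(by_level)            # order the axis would use in facet "b"
print(by_date)             # order wanted in facet "b"
```

In facet "b" the level order puts "Inform" before "Arrange" because "Inform" first appeared earlier in facet "a", even though within "b" it starts later.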


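Answer 2: (score: 0)

This question was asked in 2015, before the tidyverse and the excellent forcats package were released.

Here is a tidyverse solution to the problem:

Sort the data by Start.Date and Action with arrange (the code below sorts Start.Date in descending order), then create task_id with row_number():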

library("tidyverse")
se.df <- se.df %>%
  arrange(desc(Start.Date), Action) %>%
  mutate(task_id = row_number())

Use fct_reorder to convert Action into a factor ordered by task_id, then reverse the levels with fct_rev:

se.df <- se.df %>%
  mutate(
    Action = fct_reorder(Action, task_id),
    Action = fct_rev(Action)
  )

Now we can plot this data without having to replace the axis labels:

se.df %>%
  ggplot(aes(x = Start.Date, y = Action, color = Comms.Type)) +
  geom_segment(aes(xend = End.Date, y = Action, yend = Action), size = 5) +
  facet_grid(Source ~ ., scale = "free_y", space = "free_y", drop = TRUE)
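
Note that fct_reorder summarises with the median by default when a level occurs more than once, so an Action duplicated across facets gets a single compromise position. One possible workaround, offered as a sketch and not as part of the answer above, is to order a key that is unique per facet and strip the facet prefix when labelling. The toy data, the `key` column, the `sep` string and the `strip_prefix` helper are all hypothetical names introduced for illustration; the plot is only drawn if ggplot2 is installed.

```r
# Hypothetical toy data; column names loosely mirror se.df, values are invented.
toy <- data.frame(
  Source = c("a", "b", "b"),
  Action = c("Inform", "Arrange", "Inform"),
  Start  = as.Date(c("2015-09-01", "2015-10-01", "2015-11-01")),
  End    = as.Date(c("2015-09-20", "2015-10-20", "2015-11-20")),
  stringsAsFactors = FALSE
)

sep <- "___"  # assumption: this string never occurs inside a Source value

# Sort by start date, then action, and build a per-facet key.
toy <- toy[order(toy$Start, toy$Action), ]
toy$key <- paste(toy$Source, toy$Action, sep = sep)

# Reverse so the earliest task is the last level (drawn at the top).
toy$key <- factor(toy$key, levels = rev(unique(toy$key)))

# Drop everything up to and including the first separator.
strip_prefix <- function(x) {
  first <- as.integer(regexpr(sep, x, fixed = TRUE)) + nchar(sep)
  substring(x, first)
}

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(toy, aes(x = Start, color = Source)) +
    # size = 5 follows the answers above; newer ggplot2 prefers linewidth
    geom_segment(aes(xend = End, y = key, yend = key), size = 5) +
    scale_y_discrete(labels = strip_prefix) +
    facet_grid(Source ~ ., scales = "free_y", space = "free_y")
  print(p)
}
```

With scales = "free_y" each facet only shows its own keys, so "Inform" can sit at a different position in each facet while the axis still displays the plain label. Rows that share both Source and Action still share one key and would be drawn on the same line.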