Reordering data in ggplot (after successfully reordering the underlying data)

Date: 2016-02-10 20:06:14

Tags: r ggplot2

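After sorting my data frame in a specific order, I thought I could use the scale_x_discrete() argument to label the axis of my plot, defining the same order there so that data and labels match. Although the labels are created in the right order, ggplot seems to sort the dataset on its own, so the bars do not match the labels.
As you can see in the screenshots, the bars are drawn in the same order in both plots (one with and one without scale_x_discrete(limits = orderSort)). Is there a way to suppress the internal ordering and apply the order of new.df$UserEmail instead?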

# Load packages
library(plyr)
library(dplyr)
library(tidyr)
library(ggplot2)
library(reshape2)

# Load data
RawDataSet <- read.csv("http://pastebin.com/raw/VP6cF31A", sep=";")

# Summarising the data
new.df <- RawDataSet %>% 
  group_by(UserEmail,location,context) %>% 
  tally() %>%
  mutate(n2 = n * c(1,-1)[(location=="NOT_WITHIN")+1L]) %>%
  group_by(UserEmail,location) %>%
  mutate(p = c(1,-1)[(location=="NOT_WITHIN")+1L] * n/sum(n))

# Reorder new.df based on a defined vector
new.df <- new.df[ order(match(new.df$UserEmail, as.integer(c("28","27","25","23","22","21","20","16","12","10","9","8","5","4","2","1","29","19","17","15","14","13","7","3","30","26","24","18","11","6")) )), ]

# Same vector which is used to sort new.df
orderSort <- c("28","27","25","23","22","21","20","16","12","10","9","8","5","4","2","1","29","19","17","15","14","13","7","3","30","26","24","18","11","6")

ggplot() +
  geom_bar(data = new.df[new.df$location == "NOT_WITHIN",],
           aes(x = UserEmail, y = n2, color = "darkgreen", fill = context),
           size = 1, stat = "identity", width = 0.7) +
  geom_bar(data = new.df[new.df$location == "WITHIN",],
           aes(x = UserEmail, y = n2, color = "darkred", fill = context),
           size = 1, stat = "identity", width = 0.7) +
  # Labels are created in the right order, but geom_bars are not sorted
  # scale_x_discrete(limits = orderSort) +
  scale_y_continuous(breaks = seq(-25,25,5),
                     labels = c(25,20,15,10,5,0,5,10,15,20,25)) +
  scale_color_manual("Location of interaction",
                     values = c("darkgreen","darkred"),
                     labels = c("NOT_WITHIN","WITHIN")) +
  scale_fill_manual("Type of interaction",
                    values = c("lightyellow","lightblue"),
                    labels = c("Clicked A","Clicked B")) +
  guides(color = guide_legend(override.aes = list(color = c("darkred","darkgreen"),
                                                  fill = NA, size = 2), reverse = TRUE),
         fill = guide_legend(override.aes = list(fill = c("lightyellow","lightblue"),
                                                 color = "black", size = 0.5))) +
  coord_flip() +
  theme_grey() +
  theme(
    axis.text.x = element_text(angle = 0, hjust = 1, vjust = 0.5, size = 14),
    axis.title = element_blank(),
    legend.title = element_text(face = "italic", size = 14),
    legend.key.size = unit(1, "lines"),
    legend.text = element_text(size = 11))

Without the scale_x_discrete() argument: [screenshot: Without Label]
With the scale_x_discrete() argument: [screenshot: With Label]

1 answer:

Answer 0 (score: 2)

UPDATE: The trick is to convert the UserEmail variable to a factor variable:

# converting 'UserEmail' to a factor variable
new.df$UserEmail <- factor(as.character(new.df$UserEmail),
                           levels = unique(new.df$UserEmail))


# and use:
scale_x_discrete(limits = orderSort)

This results in the following plot:

[image: resulting plot]
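
As a minimal sketch of why the factor conversion matters: ggplot2 positions discrete values by their factor levels, not by the row order of the data frame, so sorting the rows alone does not move the bars. The toy data frame `toy` and the vector `wanted` below are hypothetical stand-ins for new.df and orderSort, not the original data.

```r
# Hypothetical stand-in for new.df (values invented for illustration)
toy <- data.frame(UserEmail = c(1L, 2L, 3L, 4L),
                  n2 = c(5, -3, 8, -1))

# Hypothetical stand-in for orderSort
wanted <- c("3", "1", "4", "2")

# Sort the rows into the wanted order, as in the question
toy <- toy[order(match(toy$UserEmail, as.integer(wanted))), ]

# Store the current row order in the factor levels;
# unique() keeps values in order of first appearance
toy$UserEmail <- factor(as.character(toy$UserEmail),
                        levels = unique(as.character(toy$UserEmail)))

# The levels now carry the order that a discrete axis will follow
print(levels(toy$UserEmail))
```

Because the levels are taken from the already sorted rows, this step only works if the sort is done first; passing the order vector directly as `levels` would be an alternative that does not depend on row order.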

OLD ANSWER: If I understand you correctly, you should define the breaks instead of the limits. Use:

scale_x_discrete(breaks = orderSort, limits = sort(unique(new.df$UserEmail)))
# or:
scale_x_discrete(breaks = orderSort, limits = as.integer(orderSort))

which gives:

[image: resulting plot]
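
To illustrate the difference between the two arguments, here is a small sketch on invented data (the data frame `toy` and its columns are hypothetical). It assumes that `limits` sets which values appear on a discrete axis and in what order, while `breaks` only selects which of those values get a tick and label. The scales are inspected with `layer_scales()`, so no plot needs to be drawn.

```r
library(ggplot2)

# Hypothetical toy data, not the data from the question
toy <- data.frame(id = c("a", "b", "c"), value = c(3, 1, 2))

# limits: reorders the positions on the axis
p_limits <- ggplot(toy, aes(x = id, y = value)) +
  geom_col() +
  scale_x_discrete(limits = c("c", "a", "b"))

# breaks: keeps the default positions, labels only "c" and "a"
p_breaks <- ggplot(toy, aes(x = id, y = value)) +
  geom_col() +
  scale_x_discrete(breaks = c("c", "a"))

# Inspect the trained x scales of both plots
limits_order <- as.character(layer_scales(p_limits)$x$get_limits())
default_order <- as.character(layer_scales(p_breaks)$x$get_limits())
shown_breaks <- as.character(layer_scales(p_breaks)$x$get_breaks())

print(limits_order)
print(default_order)
print(shown_breaks)
```

Under this reading, `breaks` alone cannot change where the bars sit, which is consistent with the updated answer relying on factor levels and `limits` instead.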