How can I make a ggplot2 axis break at the values of a variable?

Time: 2018-05-25 16:15:36

Tags: r ggplot2

This code from @camille produces a nice Pareto chart with ggplot.

library(tidyverse)

d <- tribble(
    ~ category, ~defect,
    "price", 80,
    "schedule", 27,
    "supplier", 66,
    "contact", 94,
    "item", 33
) %>% arrange(desc(defect)) %>%
    mutate(
        cumsum = cumsum(defect),
        freq = round(defect / sum(defect), 3),
        cum_freq = cumsum(freq)
    ) %>%
    mutate(category = as.factor(category) %>% fct_reorder(defect))

brks <- unique(d$cumsum)

ggplot(d, aes(x = fct_rev(category))) +
    geom_col(aes(y = defect)) +
    geom_point(aes(y = cumsum)) +
    geom_line(aes(y = cumsum, group = 1)) +
    scale_y_continuous(sec.axis = sec_axis(~. / max(d$cumsum), labels = scales::percent), breaks = brks)

[Image: Capture3.png]

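It is almost perfect, except that I would like the second y-axis to have its breaks at the cumulative y values. This can be done in base R with the code below. But how do I do it in ggplot?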

## Creating the d tribble
library(tidyverse)
d <- tribble(
  ~ category, ~defect,
  "price", 80,
  "schedule", 27,
  "supplier", 66,
  "contact", 94,
  "item", 33
)

## Creating new columns
d <- arrange(d, desc(defect)) %>%
  mutate(
    cumsum = cumsum(defect),
    freq = round(defect / sum(defect), 3),
    cum_freq = cumsum(freq)
  )

## Saving parameters (only the settable ones, so they can be restored without warnings)
def_par <- par(no.readonly = TRUE)

## New margins
par(mar=c(5,5,4,5)) 

## bar plot, pc will hold x values for bars
pc <- barplot(d$defect,
              width = 1, space = 0.2, border = NA, axes = FALSE,
              ylim = c(0, 1.05 * max(d$cumsum, na.rm = TRUE)),
              ylab = "Cumulative Counts", cex.names = 0.7,
              names.arg = d$category,
              main = "Pareto Chart (version 1)")

## Cumulative counts line 
lines(pc, d$cumsum, type = "b", cex = 0.7, pch = 19, col="cyan4")

## Framing plot
box(col = "grey62")

## adding axes
axis(side = 2, at = c(0, d$cumsum), las = 1, col.axis = "grey62", col = "grey62", cex.axis = 0.8)
axis(side = 4, at = c(0, d$cumsum), labels = paste(c(0, round(d$cum_freq * 100)) ,"%",sep=""), 
     las = 1, col.axis = "cyan4", col = "cyan4", cex.axis = 0.8)

## restoring default parameters
par(def_par) 

[Image: Capture4.png]

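Camille had some thoughts on this, but the problem remains: "Newer versions of ggplot2 allow for a secondary axis, but it needs to be based on a transformation of the primary axis. In this case, that means it should take the primary axis's values and divide by the maximum value to get a percentage."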

2 Answers:

Answer 0: (score: 4)

brks <- unique(d$cumsum)
brks2 <- unique(d$cumsum / max(d$cumsum))

ggplot(d, aes(x = fct_rev(category))) +
  geom_col(aes(y = defect)) +
  geom_point(aes(y = cumsum)) +
  geom_line(aes(y = cumsum, group = 1)) +
  scale_y_continuous(sec.axis = sec_axis(~. / max(d$cumsum), labels = scales::percent, breaks = brks2), breaks = brks)


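Answer 1: (score: 2)

The only improvement here, over my earlier code from the previous question and over @Jack Brookes's answer, is that I've eliminated the need to calculate the two sets of breaks outside the ggplot call. Instead, I simply set the breaks for the cumulative raw counts to unique(d$cumsum) and the breaks for the cumulative frequencies to unique(d$cum_freq). In both cases I prepend 0, because otherwise there would be no break at 0.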


library(tidyverse)
library(scales)

d <- tribble(
  ~ category, ~defect,
  "price", 80,
  "schedule", 27,
  "supplier", 66,
  "contact", 94,
  "item", 33
) %>% arrange(desc(defect)) %>%
  mutate(
    cumsum = cumsum(defect),
    freq = round(defect / sum(defect), 3),
    cum_freq = cumsum(freq)
  ) %>%
  mutate(category = as.factor(category) %>% fct_reorder(defect))

ggplot(d, aes(x = fct_rev(category))) +
  geom_col(aes(y = defect)) +
  geom_point(aes(y = cumsum)) +
  geom_line(aes(y = cumsum, group = 1)) +
  scale_y_continuous(breaks = c(0, unique(d$cumsum)),
    sec.axis = sec_axis(~. / max(d$cumsum), labels = scales::percent,
       breaks = c(0, unique(d$cum_freq))) 
  ) +
  theme(panel.grid.minor = element_blank())
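
As a quick sanity check on how the two sets of breaks relate, here is a minimal base-R sketch (no plotting). It assumes the same five defect counts as above; the variable names `cum_counts`, `primary_breaks`, `secondary_breaks`, `secondary_labels`, `rounded_cum_freq` and `max_gap` are made up for illustration and do not appear in the answers. It mirrors the `~. / max(d$cumsum)` transformation by dividing each primary break by the largest cumulative count, and then compares the result with the `cum_freq` column, which is built from rounded frequencies and so may differ very slightly.

```r
# Defect counts from the question, sorted in descending order
defect <- sort(c(80, 27, 66, 94, 33), decreasing = TRUE)
cum_counts <- cumsum(defect)

# Breaks on the primary (count) axis, with 0 prepended
primary_breaks <- c(0, unique(cum_counts))

# The same positions on the secondary axis: primary value / maximum,
# i.e. the transformation passed to sec_axis() in the answers
secondary_breaks <- primary_breaks / max(cum_counts)

# Percent labels, one per break
secondary_labels <- paste0(round(secondary_breaks * 100), "%")

# cum_freq as computed in the answers (cumulative sum of rounded frequencies)
rounded_cum_freq <- c(0, cumsum(round(defect / sum(defect), 3)))

# Largest difference between the two versions of the secondary breaks
max_gap <- max(abs(rounded_cum_freq - secondary_breaks))

print(data.frame(primary_breaks, secondary_breaks, secondary_labels))
print(max_gap)
```

If the small gap introduced by rounding matters for a given plot, one option (an assumption, not something either answer proposes) is to pass `primary_breaks / max(cum_counts)` as the secondary breaks, so that the tick marks on both axes sit at exactly the same heights.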