Organizing x-axis ticks sensibly in ggplot2

Time: 2017-11-11 16:34:55

Tags: r ggplot2

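The test code below produces this plot:

[Image: plot produced by the test code]

I would like to arrange the x-axis ticks sensibly. (A second image, taken from a plot of the original data, highlights the problem further.)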

Here is some code that can be used as an example:

library(ggplot2)

## Create some numbers for testing

set.seed(123)
Aboard <- sample(1:50, 50)

## some years to use

Years <- c(1931, 1931, 1931, 1934, 1934, 1934, 1934, 1937, 1937, 1937, 1937, 1937, 1938, 1943, 1943, 1943, 1943, 1943, 1955, 1955, 1955, 1955, 1955, 1961, 1961, 1961, 1970, 1970, 1970, 1970, 1973, 1973, 1973, 1978, 1980, 1980, 1982, 1982, 1983, 1984, 1984, 1985, 1986, 1986, 1986, 1987, 1987, 1989, 1990, 1990)

df <- data.frame(Aboard, Years)

###############################################################################

## I want to find the sum of Aboard for each year

## change years to factor variable, so that I have levels to work with.
df$Years <- factor(df$Years)

## blank vector to store sum values.
aboardYearTotal= c()


## iterate over the levels of the years vector.
for(y in levels(as.factor(df$Years))){
  ## I want to use an integer rather than a string
  y = as.numeric(y)
  ## for each level - find the sum of all Aboard values that correspond with it.
  ## I need to remove NA values as there are some.
  yy=sum(df$Aboard[df$Years==y], na.rm = TRUE)
  aboardYearTotal = c(aboardYearTotal, yy)
}

## I no longer need y, or yy
rm(y)
rm(yy)

###############################################################################

## Create plot using this variable

yearLevels <- levels(as.factor(df$Years))
aboardYears <- data.frame(yearLevels, aboardYearTotal)

## Create a plot of the data for total number aboard each year
p <- ggplot(aboardYears, aes(yearLevels, aboardYearTotal))
p + geom_point(aes(size = aboardYearTotal))

How can I control the tick marks on the x-axis?

I tried playing with scale_x_continuous and scale_x_discrete, but I could not get either to work as expected.

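Ideally, I could choose:

  • the starting value of x
  • the ending value of x
  • the spacing between these endpoints

For example, with a starting value of 0, an ending value of 10, and a spacing of 2, the x-axis would be labelled: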

0 2 4 6 8 10

Here is the original plot, which highlights my problem with the x-axis:

[Image: plot of the original data]

I am open to suggestions about better ways to do this.

1 Answer:

Answer 0: (score: 3)

Don't convert Years to a factor. Instead, keep it numeric and use stat_summary to compute the sums.

df <- data.frame(Aboard, Years)

ggplot(df, aes(Years, Aboard)) +
  stat_summary(fun.y=sum, geom="point", aes(size=..y..))
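
To see why the factor conversion gets in the way, here is a minimal base R sketch using a few year values from the question. The levels of a factor are character strings, so a data frame built from levels() hands ggplot a discrete x variable, which is likely why scale_x_continuous did not behave as expected. The variable names `years` and `f` are only for illustration.

```r
# A few year values taken from the question's data
years <- c(1931, 1934, 1934, 1990)
f <- factor(years)

levels(f)                    # character labels: "1931" "1934" "1990"
as.numeric(f)                # integer codes, not years: 1 2 2 3
as.numeric(as.character(f))  # the original years: 1931 1934 1934 1990
```

If the data is already stored as a factor, converting through as.character() first recovers the numeric years.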

ggplot will choose sensible defaults for the x-axis labels, but you can also change them. For example:

ggplot(df, aes(Years, Aboard)) +
  stat_summary(fun.y=sum, geom="point", aes(size=..y..)) +
  scale_x_continuous(breaks=seq(1920, 2020, 20))

[Image: the resulting plot]
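
A note for newer ggplot2 releases: from ggplot2 3.3.0 onward, the fun.y argument is deprecated in favour of fun, and after_stat(y) replaces the ..y.. notation. The sketch below shows the same plot in that newer syntax, assuming ggplot2 3.3.0 or later is installed. The Years column here is a shortened, made-up stand-in for the question's vector.

```r
library(ggplot2)

set.seed(123)
df <- data.frame(
  Aboard = sample(1:50, 50),
  # Shortened, made-up years (not the question's full vector)
  Years  = rep(c(1931, 1955, 1970, 1986, 1990), each = 10)
)

# Sum Aboard per year inside the plot and size each point by that sum
p <- ggplot(df, aes(Years, Aboard)) +
  stat_summary(fun = sum, geom = "point", aes(size = after_stat(y))) +
  scale_x_continuous(breaks = seq(1920, 2020, 20))
p
```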

You can set the x-axis breaks to any values you like by supplying a vector of those values. For example:

scale_x_continuous(breaks=seq(min(df$Years), max(df$Years)+6, 6))

scale_x_continuous(breaks=c(1931, 1955))
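
The question asked for a start value, an end value, and a spacing. A minimal sketch of that idea in base R follows. `axis_breaks` is a hypothetical helper written for this example, not a ggplot2 function; it only wraps seq().

```r
# Hypothetical helper (not part of ggplot2): evenly spaced breaks
# from a start value, an end value, and a step
axis_breaks <- function(start, end, step) {
  stopifnot(step > 0, end >= start)
  seq(start, end, by = step)
}

axis_breaks(0, 10, 2)
# [1]  0  2  4  6  8 10

# seq() never goes past the end value, so the last break can fall
# short of the largest year ...
tail(axis_breaks(1931, 1990, 6), 1)
# [1] 1985

# ... which is why padding the upper end by one step, as above, helps
tail(axis_breaks(1931, 1990 + 6, 6), 1)
# [1] 1991
```

The resulting vector can be passed as the breaks argument of scale_x_continuous. That function also takes a limits argument, for example limits = c(start, end), if the axis itself should start and stop at those values.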

Sometimes you need or want to summarise the data outside of ggplot. There are many options. Here are a couple:

Base R

df.summary = aggregate(Aboard ~ Years, df, sum)

tidyverse

library(tidyverse)

df.summary = df %>%
  group_by(Years) %>% 
  summarise(Aboard = sum(Aboard))
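
The question's loop used na.rm = TRUE because the real data contains NA values. As a small sketch on made-up numbers (not the question's data), the formula interface of aggregate drops rows with NA by default (na.action = na.omit), so the base R one-liner gives the same kind of per-year totals without a loop, and Years stays numeric.

```r
# Made-up data with one missing Aboard value
df <- data.frame(
  Aboard = c(10, 20, NA, 5, 7, 3),
  Years  = c(1931, 1931, 1934, 1934, 1937, 1937)
)

# Rows containing NA are omitted before sum() is applied
df.summary <- aggregate(Aboard ~ Years, df, sum)

df.summary$Aboard            # totals per year: 30 5 10
is.numeric(df.summary$Years) # TRUE, so a continuous x scale still applies
```

One caveat: a year whose Aboard values are all NA would disappear from the result entirely, instead of showing a total of 0.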

You can even do this on the fly while plotting, without creating a separate summary data frame. For example:

ggplot(aggregate(Aboard ~ Years, df, sum), aes(Years, Aboard, size=Aboard)) +
  geom_point()

df %>%
  group_by(Years) %>% 
  summarise(Aboard = sum(Aboard)) %>% 
  ggplot(aes(Years, Aboard, size=Aboard)) +
    geom_point()