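The test code below produces this plot:
I would like the x-axis ticks to be organised sensibly (the image further down, a plot created from the original data, highlights the problem more clearly).
Here is some code that can be used as an example: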
library(ggplot2)
## Create some numbers for testing
set.seed(123)
Aboard <- sample(1:50,50)
## some years to use
Years <- c(1931, 1931, 1931, 1934, 1934, 1934, 1934, 1937, 1937, 1937, 1937, 1937, 1938, 1943, 1943, 1943, 1943, 1943, 1955, 1955, 1955, 1955, 1955, 1961, 1961, 1961, 1970, 1970, 1970, 1970, 1973, 1973, 1973, 1978, 1980, 1980, 1982, 1982, 1983, 1984, 1984, 1985, 1986, 1986, 1986, 1987, 1987, 1989, 1990, 1990)
df <- data.frame(Aboard, Years)
###############################################################################
## I WANT TO FIND THE SUM OF Aboard FOR EACH YEAR
## change years to factor variable, so that I have levels to work with.
df$Years <- factor(df$Years)
## blank vector to store sum values.
aboardYearTotal= c()
## iterate over the levels of the years vector.
for(y in levels(as.factor(df$Years))){
## I want to use an integer rather than a string
y = as.numeric(y)
## for each level - find the sum of all Aboard values that correspond with it.
## I need to remove NA values as there are some.
yy=sum(df$Aboard[df$Years==y], na.rm = TRUE)
aboardYearTotal = c(aboardYearTotal, yy)
}
## I no longer need y, or yy
rm(y)
rm(yy)
###############################################################################
## Create plot using this variable
yearLevels <- levels(as.factor(df$Years))
aboardYears <- data.frame(yearLevels, aboardYearTotal)
## Create a plot of the data for total number aboard each year
p <- ggplot(aboardYears, aes(yearLevels, aboardYearTotal))
p + geom_point(aes(size = aboardYearTotal))
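How can I control the tick marks on the x axis? I tried playing around with scale_x_continuous and scale_x_discrete, but I could not get either to work as expected.
For example, if my starting value is 0, my ending value is 10 and the spacing is 2, I would label the x axis as: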
0 2 4 6 8 10
Here is the original plot, which highlights my problem with the x axis:
I am open to advice or suggestions on better practice.
Answer 0 (score: 3)
Don't convert Years to a factor. Instead, keep it numeric and use stat_summary to take care of the sums.
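library(ggplot2)
df <- data.frame(Aboard, Years)
## ggplot2 >= 3.3.0 syntax; older versions use fun.y = sum and size = ..y..
ggplot(df, aes(Years, Aboard)) +
  stat_summary(fun = sum, geom = "point", aes(size = after_stat(y)))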
ggplot will choose sensible defaults for the x-axis labels, but you can also change those defaults. For example:
## ggplot2 >= 3.3.0 syntax; older versions use fun.y = sum and size = ..y..
ggplot(df, aes(Years, Aboard)) +
  stat_summary(fun = sum, geom = "point", aes(size = after_stat(y))) +
  scale_x_continuous(breaks = seq(1920, 2020, 20))
You can set the x-axis breaks to whatever values you want by supplying a vector of those values. For example:
scale_x_continuous(breaks=seq(min(df$Years), max(df$Years)+6, 6))
or
scale_x_continuous(breaks=c(1931, 1955))
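For the 0 to 10 by 2 case described in the question, the break vector can be built with seq. The following is a minimal sketch in base R; year_breaks is a hypothetical helper written for illustration, not a ggplot2 function, and the step of 10 is an arbitrary choice.

```r
# The 0-to-10-by-2 example from the question, written as a break vector
seq(0, 10, by = 2)

# Hypothetical helper (not part of ggplot2): evenly spaced breaks that
# cover the range of x, rounded outwards to multiples of `step`
year_breaks <- function(x, step) {
  lo <- floor(min(x, na.rm = TRUE) / step) * step
  hi <- ceiling(max(x, na.rm = TRUE) / step) * step
  seq(lo, hi, by = step)
}

# First and last years from the example data
brk <- year_breaks(c(1931, 1990), step = 10)
```

The resulting vector could then be passed as scale_x_continuous(breaks = brk), provided Years is still numeric.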
Sometimes you need or want to do the data summary operations outside of ggplot. There are many options. Here are a couple:
Base R
df.summary = aggregate(Aboard ~ Years, df, sum)
tidyverse
library(tidyverse)
df.summary = df %>%
  group_by(Years) %>%
  summarise(Aboard = sum(Aboard))
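To see what such a summary looks like, here is a small sketch using base R's aggregate on a made-up three-row data frame (toy and its values are hypothetical, not the question's data). It shows that the grouping column stays numeric, which is what lets a continuous x scale place the years correctly.

```r
# Hypothetical toy data: two rows for 1931, one row for 1934
toy <- data.frame(Years = c(1931, 1931, 1934), Aboard = c(10, 5, 7))

# One row per year, with Aboard summed within each year
toy.summary <- aggregate(Aboard ~ Years, toy, sum)

# Years is still numeric after aggregation, not a factor
is.numeric(toy.summary$Years)
```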
You can even do this on the fly when plotting the data, without creating a separate summary data frame. For example:
ggplot(aggregate(Aboard ~ Years, df, sum), aes(Years, Aboard, size = Aboard)) +
  geom_point()
or
df %>%
  group_by(Years) %>%
  summarise(Aboard = sum(Aboard)) %>%
  ggplot(aes(Years, Aboard, size = Aboard)) +
  geom_point()
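If Years has already been converted to a factor, as in the question's code, it needs to be turned back into numbers before a continuous x scale will work. A minimal sketch of the usual base R idiom follows; yrs is a made-up vector for illustration. Note that calling as.numeric directly on a factor returns the internal level codes rather than the years.

```r
# Hypothetical factor of years, similar to df$Years after factor()
yrs <- factor(c(1931, 1934, 1934, 1990))

# Pitfall: this gives the level codes, not the year values
codes <- as.numeric(yrs)

# Going through character recovers the actual years
yrs_num <- as.numeric(as.character(yrs))
```

Under the same assumption, df$Years <- as.numeric(as.character(df$Years)) would restore a numeric column in the question's data frame.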