Plotting multiple datasets using lubridate or scales

Time: 2019-01-29 05:32:27

Tags: r scale lubridate

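I have created three datasets filtered by year (banks2016, banks2017, banks2018). I have made a single plot from these three datasets, so it has three lines in different colours.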

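My problem is that the transactions are weekly, so each month has four points to display, and all four are drawn at that month's single position on the x-axis. For example, if I have payments on 1-1-16, 8-1-16, 15-1-16 and 22-1-16, they all show up stacked in a vertical line at January. Ideally, I would like the line and points to be spread out between January and February.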

I have tried a few different approaches, including date_breaks from the scales package. I have also tried changing how I use lubridate, but to no avail. Any suggestions?

My code is below.

ggplot(rbind(banks2016,banks2017,banks2018), 
       aes(month(Date, label=TRUE, abbr=TRUE), Balance, 
       group = factor(year(Date)), colour=factor(year(Date)))) +  
  geom_line() +
  geom_point() +
  labs(x="Month", colour="Year") +
  theme_classic()

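And here is the data for banks2016. I want to plot the total balance against date: one continuous line that advances week by week, with months as the x-axis labels. Looking at the data more closely now, the dates are not always weekly as I first thought. I may need to reshape the data.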

structure(list(Date = structure(c(17038, 17038, 17038, 17031, 17029, 17024, 17022, 17017, 17017, 17014, 17009, 17008, 16996, 16989, 16989, 16987, 16987, 16987, 16983), class = "Date"), Debits = c(NA, NA, 1686451.25, NA, NA, 3111755.91, NA, NA, 25100, 3.66, NA, NA, 313.26, NA, 1566.27, NA, NA, NA, 0.8), Credits = c(14693.48, 10250, NA, 409.25, 5655863.07, NA, 2304.45, 2443, NA, NA, 300, 122, NA, 8716.45, NA, 30000, 25000, 5993.6, NA), Balance = c(15824841.24, 15810147.76, 15799897.76, 17486349.01, 17485939.76, 11830076.69, 14941832.6, 14939528.15, 14937085.15, 14962185.15, 14962188.81, 14961888.81, 14961766.81, 14962080.07, 14953363.62, 14954929.89, 14924929.89, 14899929.89, 14893936.29)), row.names = c(NA, -19L), class = "data.frame")

1 Answer:

Answer 0 (score: 0)

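It sounds like you want the x-axis to run from January to December, with each line showing the balance over time for a separate calendar year; is that correct? If so, one technique (described in this excellent answer) is to create a new date column that puts all of the dates in the same year, and plot against that column while still grouping by the year of the actual date. Here is how that looks with your dataset: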

library(ggplot2)
library(lubridate)
library(dplyr)

# Posted dataset.
banks = structure(list(Date = structure(c(17038, 17038, 17038, 17031, 17029, 17024, 17022, 17017, 17017, 17014, 17009, 17008, 16996, 16989, 16989, 16987, 16987, 16987, 16983), class = "Date"), Debits = c(NA, NA, 1686451.25, NA, NA, 3111755.91, NA, NA, 25100, 3.66, NA, NA, 313.26, NA, 1566.27, NA, NA, NA, 0.8), Credits = c(14693.48, 10250, NA, 409.25, 5655863.07, NA, 2304.45, 2443, NA, NA, 300, 122, NA, 8716.45, NA, 30000, 25000, 5993.6, NA), Balance = c(15824841.24, 15810147.76, 15799897.76, 17486349.01, 17485939.76, 11830076.69, 14941832.6, 14939528.15, 14937085.15, 14962185.15, 14962188.81, 14961888.81, 14961766.81, 14962080.07, 14953363.62, 14954929.89, 14924929.89, 14899929.89, 14893936.29)), row.names = c(NA, -19L ), class = "data.frame")
# The posted dataset is for only one year (2016).  Duplicate it for 2017 and
# 2018, and change the balances a bit, so we can see the grouping.
banks = bind_rows(
  banks,
  banks %>%
    mutate(Date = Date + years(1),
           Balance = Balance * 1.1),
  banks %>%
    mutate(Date = Date + years(2),
           Balance = Balance * 1.2)
)

# Add a utility "date for plotting" field that puts all the dates in the year
# 2000.
banks = banks %>%
  mutate(DateToPlot = Date - years(year(Date) - 2000))

# Plot Balance as a function of DateToPlot.  Group/color by year.  Make the
# x-axis labels look pretty.
ggplot(banks, 
       aes(x = DateToPlot, y = Balance,
           group = factor(year(Date)), colour=factor(year(Date)))) +  
  geom_line() +
  geom_point() +
  scale_x_date(date_breaks = "1 month",
               date_labels = "%B") +
  labs(x="Month", colour="Year") +
  theme_classic()
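
To see why the original plot stacked the weekly points, here is a minimal sketch using the four example dates from the question. The variable names (dates, month_x, plot_x) are made up for illustration, and the year 2000 is an arbitrary choice of common year. With label = TRUE, month() returns an ordered factor, so every date in a month maps to the same x position. Shifting the dates into a common year keeps the day of the month.

```r
library(lubridate)

# The four weekly January dates from the question's example.
dates <- as.Date(c("2016-01-01", "2016-01-08", "2016-01-15", "2016-01-22"))

# month(..., label = TRUE) collapses each date to its month label (an ordered
# factor), so all four dates end up at a single x position.
month_x <- month(dates, label = TRUE, abbr = TRUE)
length(unique(month_x))  # 1

# Shifting every date into the same year (2000, chosen arbitrarily) keeps the
# day of the month, so the four points remain distinct on a date axis.
plot_x <- dates - years(year(dates) - 2000)
length(unique(plot_x))   # 4
```

Since 2000 is a leap year, a 29 February from a leap year such as 2016 still has a valid counterpart after the shift.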