Question

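I'm a bit stuck on some code. I'd of course appreciate a snippet that solves my problem, but I'd be just as grateful for pointers on how to approach it.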

Here goes: first, I installed the packages (ggplot2, lubridate and openxlsx).

The relevant part: I pull a file from the Italian gas TSO's website:

Storico_G1 <- read.xlsx(xlsxFile = "http://www.snamretegas.it/repository/file/Info-storiche-qta-gas-trasportato/dati_operativi/2017/DatiOperativi_2017-IT.xlsx",sheet = "Storico_G+1", startRow = 1, colNames = TRUE)

Then I created a data frame containing the variables I want to keep:

Storico_G1_df <- data.frame(Storico_G1$pubblicazione, Storico_G1$IMMESSO, Storico_G1$`SBILANCIAMENTO.ATTESO.DEL.SISTEMA.(SAS)`)

Then I convert the time format:

Storico_G1_df$pubblicazione   <- ymd_h(Storico_G1_df$Storico_G1.pubblicazione)

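Now the struggle begins. In this example I want to plot 2 time series with 2 different Y axes, because their ranges are very different. That in itself is not a real problem, since I can achieve it with the melt function and ggplot. However, one of the columns contains NAs, and I don't know how to work around that. In the incomplete (SAS) column I mainly care about the 16:00 data point, so ideally I would plot the hourly values on one chart and only 1 data point per day (the 16:00 one) on the second chart. I have attached an example image, built from unrelated data, to show the chart style. In that image, though, both charts have the same data points, so it works fine.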

Thanks for any hints.

Take care

Answer 1

library(lubridate)
library(ggplot2)
library(openxlsx)
library(dplyr)

# Use na.strings: missing values appear under several spellings in the dataset
storico.xl <- read.xlsx(xlsxFile = "http://www.snamretegas.it/repository/file/Info-storiche-qta-gas-trasportato/dati_operativi/2017/DatiOperativi_2017-IT.xlsx",
                        sheet = "Storico_G+1", startRow = 1,
                        colNames = TRUE,
                        na.strings = c("NA","N.D.","N.D"))

#Select and rename the crazy column names
storico.g1 <- data.frame(storico.xl) %>% 
   select(pubblicazione, IMMESSO, SBILANCIAMENTO.ATTESO.DEL.SISTEMA..SAS.)
names(storico.g1) <- c("date_hour","immesso","sads")


# The date column looks like it is in a format that ymd_h can parse
storico.g1 <- storico.g1 %>% mutate(date_hour = ymd_h(date_hour))


#Not sure exactly what you want to plot, but here is each point by hour
ggplot(storico.g1, aes(x= date_hour, y = immesso)) + geom_line()

# To aggregate per day, derive a date from date_hour and group on it.
# The count column lets you check that each day has 24 points.
# The summarised columns are then piped into ggplot.

storico.g1 %>% 
  group_by(date = as.Date(date_hour)) %>%
  summarise(count = n(),
            daily.immesso = sum(immesso)) %>%
  ggplot(aes(x = date, y = daily.immesso)) + geom_line()
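
The question also asks for the hourly series on one chart and only the 16:00 value of the SAS column on a second chart. Below is a minimal sketch of one way to do that, not necessarily what the asker ended up using. It runs on a small synthetic data frame called `demo` (a hypothetical stand-in with made-up values, using the same column names as `storico.g1` above) so that it works without downloading the file. It assumes the 16:00 rows are the ones that carry a value in `sads`. The two series are filtered separately, so the NAs never reach the plot, and `facet_wrap` with free Y scales gives each series its own panel and its own axis range.

```r
library(dplyr)
library(ggplot2)
library(lubridate)

# Synthetic stand-in for storico.g1: 3 days of hourly rows (made-up values).
demo <- data.frame(
  date_hour = seq(ymd_h("2017-01-01 00"), by = "hour", length.out = 72)
)
demo$immesso <- 200 + 20 * sin(seq_len(72) / 4)
# Assumption: sads is NA everywhere except at 16:00.
demo$sads <- ifelse(hour(demo$date_hour) == 16, 5, NA_real_)

# Hourly series: every row that has an immesso value.
hourly <- demo %>%
  filter(!is.na(immesso)) %>%
  transmute(date_hour, series = "immesso (hourly)", value = immesso)

# Daily series: only the 16:00 rows that have a sads value.
daily_16 <- demo %>%
  filter(hour(date_hour) == 16, !is.na(sads)) %>%
  transmute(date_hour, series = "sads (16:00 only)", value = sads)

# One panel per series, shared time axis, independent Y scales.
p <- ggplot(mapping = aes(x = date_hour, y = value)) +
  geom_line(data = hourly) +
  geom_point(data = daily_16) +
  facet_wrap(~ series, ncol = 1, scales = "free_y")
p
```

To try it on the real data, replace `demo` with `storico.g1`. If the two series really must share a single panel, ggplot2's `sec_axis` is the usual alternative, but it requires rescaling one series by hand, which is why the faceted layout is used here.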

Plotting only one hourly data point per day (1 per day) alongside hourly points (24 per day) in R Studio
