Creating a time series plot and converting numeric data to dates

Time: 2019-06-18 02:01:13

Tags: r

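I want to create a time plot of the temperature at each of 2 sites. I have temperature readings every 10 minutes for every day from February to April, and I need to plot the hourly mean temperature for each day.

I calculated the hourly mean temperature for each day and tried to create a plot with different combinations of geom_point and geom_line.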


data <- read.xlsx("temperatura.xlsx", 1)
data <- data %>% mutate(month = as.factor(month), day = as.factor(day), h = as.factor(h), min = as.factor(min))

head (data)
month day h min  t.site1 t.site2
  2   1   0   0  15.485  16.773
  2   1   0  10  15.509  16.773
  2   1   0  20  15.557  16.773
  2   1   0  30  15.557  16.773
  2   1   0  40  15.605  16.773
  2   1   0  50  15.605  16.773


str(data)
'data.frame':   12816 obs. of  6 variables:
 $ month  : Factor w/ 3 levels "2","3","4": 1 1 1 1 1 1 1 1 1 1 ...
 $ day    : Factor w/ 31 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ h      : Factor w/ 24 levels "0","1","2","3",..: 1 1 1 1 1 1 2 2 2 2 ...
 $ min    : Factor w/ 6 levels "0","10","20",..: 1 2 3 4 5 6 1 2 3 4 ...
 $ t.site1: num  15.5 15.5 15.6 15.6 15.6 ...
 $ t.site2: num  16.8 16.8 16.8 16.8 16.8 ...


hour <- group_by(data, month, day, h) 

mean.h.site1 <- summarize(hour, mean.h.site1 = mean(t.site1))

t1 <- ggplot (data = mean.h.site1, aes(x=h, y=mean.h.site1)) +
  geom_line()

t2 <- ggplot(data = mean.h.site1, aes(x=h, y=mean.h.site1, group = month))+
  geom_line() +
  geom_point()

t3 <- ggplot (data = mean.h.site1, aes(x=day, y=mean.h.site1, group=1))+
  geom_point()


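I expected to see the temperature at each site over time, but the actual output shows the temperature variation within a day.

3 answers:

Answer 0: (score: 0)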

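Interestingly, your data shows month, day and hour as factor. Were there some character values somewhere in those columns when the data was read? It is very unusual to see numbers stored as factors this way.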
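A small illustration of why the conversion has to go through as.character(): calling as.numeric() directly on a factor returns the internal level codes, not the printed values. The vector f below is a made-up example, not taken from the question's data.

```r
# Hypothetical factor whose labels happen to be numbers
f <- factor(c("10", "20", "0"))

codes  <- as.numeric(f)                 # 2 3 1   (positions in the sorted levels)
values <- as.numeric(as.character(f))   # 10 20 0 (the values actually wanted)
```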

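I would do four things:

  1. Convert the factors to numbers
  2. Convert the numbers to dates
  3. Convert the wide table to a long one, and finally
  4. Plot the temperature against the actual dates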

    # Load packages and data
    library(data.table) # for overall fast data processing
    library(lubridate) # for dates wrangling
    library(ggplot2) # plotting
    
    dt <- fread("month day h min  t.site1 t.site2
      2   1   0   0  15.485  16.773
      2   1   0  10  15.509  16.773
      2   1   0  20  15.557  16.773
      2   1   0  30  15.557  16.773
      2   1   0  40  15.605  16.773
      2   1   0  50  15.605  16.773")
    
    # Convert factors to numbers (I didn't actually run this because I just created the data.table, but it seems you'll need to do it):
    
    dt[, names(dt)[1:4] := lapply(.SD, function(x) as.numeric(as.character(x))), .SDcols = 1:4]
    
    # Create proper dates. We'll consider all dates occurring in 2019.
    dt[, date := ymd_hm(paste0("2019/", month, "/", day, " ", h, ":", min))]
    
    # convert wide data to long one
    dt2 <- melt(dt[, .(date, t.site1, t.site2)], id.vars = "date")
    
    # plot the data
    ggplot(dt2, aes(x = date, y = value, color = variable))+geom_point()+geom_path()
    

Resulting plot
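
The plot above uses the raw 10-minute readings, while the question asks for hourly means. One possible way to get them, as a sketch only: truncate each timestamp to its hour with lubridate's floor_date() and average within each hour. The sample table and the names hourly and hour are assumptions made for illustration, and the temperatures are invented.

```r
library(data.table)
library(lubridate)

# Hypothetical sample: two hours of 10-minute readings (made-up temperatures)
dt <- data.table(
  date    = as.POSIXct("2019-02-01 00:00", tz = "UTC") + seq(0, 110, by = 10) * 60,
  t.site1 = c(rep(15, 6), rep(16, 6)),
  t.site2 = c(rep(17, 6), rep(18, 6))
)

# Truncate each timestamp to the start of its hour, then average per hour
hourly <- dt[, .(t.site1 = mean(t.site1), t.site2 = mean(t.site2)),
             by = .(hour = floor_date(date, "hour"))]
```

The resulting table has one row per hour and can be passed through melt() and ggplot() in the same way as dt, with hour in place of date.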

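Answer 1: (score: 0)

I assume the output you actually need shows the hourly temperature variation for each day in a single plot?

Edit: I have updated the code to generate one day of data and to produce the plot.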

library(tidyverse)
library(lubridate)

# tibble() replaces the deprecated data_frame()
df <- tibble(month = rep(2, 144),
             day = rep(1, 144),
             h = rep(0:23, each = 6),      # 24 hours x 6 readings per hour
             min = rep((0:5) * 10, 24),
             t.site1 = rnorm(n = 144, mean = 15.501, sd = 0.552),
             t.site2 = rnorm(n = 144, mean = 16.501, sd = 0.532))

df %>%
        group_by(month, day, h) %>%
        summarise(mean_t_site1 = mean(t.site1), mean_t_site2 = mean(t.site2)) %>%
        mutate(date = ymd_h(paste0("2019-",month,"-",day," ",h))) %>%
        ungroup() %>%
        select(mean_t_site1:date) %>%
        gather(key = "site", value = "mean_temperature", -date) %>%
        ggplot(aes(x = date, y = mean_temperature, colour = site)) +
        geom_line()

Can you verify whether this is the output you need?
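
In newer tidyr releases gather() is superseded. Assuming tidyr 1.0.0 or later is installed, the same wide-to-long step could be sketched with pivot_longer(). The data frame hourly below is a hypothetical stand-in for the summarised data, with made-up values.

```r
library(tidyr)

# Hypothetical hourly means (made-up values), one row per hour
hourly <- data.frame(
  date         = as.POSIXct("2019-02-01 00:00", tz = "UTC") + 3600 * 0:2,
  mean_t_site1 = c(15.5, 15.4, 15.6),
  mean_t_site2 = c(16.5, 16.6, 16.4)
)

# One row per (date, site) pair, equivalent to the gather() call
long <- pivot_longer(hourly, cols = -date,
                     names_to = "site", values_to = "mean_temperature")
```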

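Answer 2: (score: 0)

You can paste the time columns together and convert the result with as.POSIXct.

As @PavoDive has already pointed out, we need numeric time columns. Check the code that generates your data, or convert to numeric with d[1:4] <- Map(function(x) as.numeric(as.character(x)), d[1:4]).

Now apply paste over the rows inside as.POSIXct and cbind the result to the remaining columns. The sprintf call first makes sure that all values have the same number of digits before they are pasted.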

d2 <- cbind(time=as.POSIXct(apply(sapply(d[1:4], sprintf, fmt="%02d"), 1, paste, collapse=""), 
                  format="%m%d%H%M"), 
            d[5:6])
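
To see what the padding does, here is a sketch for a single row (month 2, day 1, hour 0, minute 10). Without padding, the pasted string would be "21010", which cannot be split unambiguously by "%m%d%H%M". The explicit year and tz = "UTC" are assumptions added here to make the result reproducible; with the format used above, R fills in the current year.

```r
# One row of the time columns: month, day, hour, minute
parts <- c(2, 1, 0, 10)

padded <- sprintf("%02d", parts)        # "02" "01" "00" "10"
stamp  <- paste(padded, collapse = "")  # "02010010"

# Assumed year 2019 and UTC, prepended so the parse does not depend on today's date
time <- as.POSIXct(paste0("2019", stamp), format = "%Y%m%d%H%M", tz = "UTC")
```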

This plots nicely, here in base R:

with(d2, plot(time, t.site1, ylim=c(15, 17), xaxt="n",
              xlab="time", ylab="value", type="b", col="red",
              main="Time series"))
with(d2, lines(time, t.site2, type="b", col="green"))
mtext(strftime(d2$time, "%H:%M"), 1, 1, at=d2$time)  # strftime gives the desired formatting
legend("bottomright", names(d2)[2:3], col=c("red", "green"), lty=rep(1, 2))


Data

d <- structure(list(month = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "2", class = "factor"), 
    day = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "1", class = "factor"), 
    h = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "0", class = "factor"), 
    min = structure(1:6, .Label = c("0", "10", "20", "30", "40", 
    "50"), class = "factor"), t.site1 = c(15.485, 15.509, 15.557, 
    15.557, 15.605, 15.605), t.site2 = c(16.773, 16.773, 16.773, 
    16.773, 16.773, 16.773)), row.names = c(NA, -6L), class = "data.frame")