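I want to create a time plot of the temperature at each of 2 sites. I have temperature readings every 10 minutes for each day from February to April, and I need to plot the hourly mean temperature for each day.
I calculated the hourly mean temperature for each day and tried to create a plot with different geom_point and geom_line approaches.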
data <- read.xlsx("temperatura.xlsx", 1)
data <- data %>% mutate(month = as.factor(month), day = as.factor(day), h = as.factor(h), min = as.factor(min))
head (data)
month day h min t.site1 t.site2
2 1 0 0 15.485 16.773
2 1 0 10 15.509 16.773
2 1 0 20 15.557 16.773
2 1 0 30 15.557 16.773
2 1 0 40 15.605 16.773
2 1 0 50 15.605 16.773
str(data)
'data.frame': 12816 obs. of 6 variables:
$ month : Factor w/ 3 levels "2","3","4": 1 1 1 1 1 1 1 1 1 1 ...
$ day : Factor w/ 31 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
$ h : Factor w/ 24 levels "0","1","2","3",..: 1 1 1 1 1 1 2 2 2 2 ...
$ min : Factor w/ 6 levels "0","10","20",..: 1 2 3 4 5 6 1 2 3 4 ...
$ t.site1: num 15.5 15.5 15.6 15.6 15.6 ...
$ t.site2: num 16.8 16.8 16.8 16.8 16.8 ...
hour <- group_by(data, month, day, h)
mean.h.site1 <- summarize(hour, mean.h.site1 = mean(t.site1))
t1 <- ggplot (data = mean.h.site1, aes(x=h, y=mean.h.site1)) +
geom_line()
t2 <- ggplot(data = mean.h.site1, aes(x=h, y=mean.h.site1, group = month))+
geom_line() +
geom_point()
t3 <- ggplot (data = mean.h.site1, aes(x=day, y=mean.h.site1, group=1))+
geom_point()
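I expected a plot of the temperature at each site over time, but the actual output shows the temperature variation within each day.
Answer 0 (score: 0)
Interestingly, your data has month, day, and hour stored as factor. Were there character values somewhere in those columns when you read the data? It is very unusual to see numbers stored as factors this way.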
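The reason the factor storage matters can be shown with a small sketch. The vector h below is a made-up stand-in for the question's h column, not the real data. Calling as.numeric directly on a factor returns the internal level codes, so the values have to go through as.character first.

```r
# Hypothetical factor holding hour values, standing in for the `h` column
h <- factor(c("0", "10", "23"))

# Direct conversion returns the internal level codes, not the hours
codes <- as.numeric(h)                 # 1 2 3

# Going through character first recovers the hour values themselves
hours <- as.numeric(as.character(h))   # 0 10 23
```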
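I would do four things:
Convert the factors to numbers
Create proper dates
Reshape the data from wide to long
Plot the temperature against the actual date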
# Load packages and data
library(data.table) # for overall fast data processing
library(lubridate) # for dates wrangling
library(ggplot2) # plotting
dt <- fread("month day h min t.site1 t.site2
2 1 0 0 15.485 16.773
2 1 0 10 15.509 16.773
2 1 0 20 15.557 16.773
2 1 0 30 15.557 16.773
2 1 0 40 15.605 16.773
2 1 0 50 15.605 16.773")
# Convert factors to numbers (I actually didn't run this because I just created the data.table, but it seems you'll need to do it):
dt[, names(dt)[1:4] := lapply(.SD, function(x) as.numeric(as.character(x))), .SDcols = 1:4]
# Create proper dates. We'll consider all dates occurring in 2019.
dt[, date := ymd_hm(paste0("2019/", month, "/", day, " ", h, ":", min))]
# convert wide data to long one
dt2 <- melt(dt[, .(date, t.site1, t.site2)], id.vars = "date")
# plot the data
ggplot(dt2, aes(x = date, y = value, color = variable))+geom_point()+geom_path()
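The plot above uses the raw 10-minute readings, while the question asks for hourly means. A minimal base R sketch of that aggregation step follows. The data frame d holds made-up readings for illustration only, and the year 2019 is an assumption carried over from the code above.

```r
# Made-up 10-minute readings covering two hours (illustration only)
d <- data.frame(
  month   = 2,
  day     = 1,
  h       = rep(0:1, each = 6),
  min     = rep(seq(0, 50, by = 10), times = 2),
  t.site1 = c(15.0, 15.2, 15.4, 15.6, 15.8, 16.0,
              16.0, 16.1, 16.2, 16.3, 16.4, 16.5),
  t.site2 = 16.5
)

# One row per month/day/hour, holding the mean of each site
hourly <- aggregate(cbind(t.site1, t.site2) ~ month + day + h,
                    data = d, FUN = mean)

# Timestamp at the start of each hour; the year 2019 is assumed
hourly$date <- as.POSIXct(
  sprintf("2019-%02d-%02d %02d:00", hourly$month, hourly$day, hourly$h),
  format = "%Y-%m-%d %H:%M", tz = "UTC"
)

# Sort chronologically so that lines connect in time order
hourly <- hourly[order(hourly$date), ]
```

The hourly table can then be plotted with date on the x axis in the same way as the raw data.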
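Answer 1 (score: 0)
I assume you want the output to show the hourly temperature variation for every day in the same plot?
Edit: I have updated the code to generate one day of data and to produce the plot as well.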
library(tidyverse)
library(lubridate)
# tibble() replaces the deprecated data_frame()
df <- tibble(month = rep(2, 144),
             day = rep(1, 144),
             h = rep(0:23, each = 6),      # hours 0 to 23, six readings per hour
             min = rep((0:5) * 10, 24),
             t.site1 = rnorm(n = 144, mean = 15.501, sd = 0.552),
             t.site2 = rnorm(n = 144, mean = 16.501, sd = 0.532))
df %>%
group_by(month, day, h) %>%
summarise(mean_t_site1 = mean(t.site1), mean_t_site2 = mean(t.site2)) %>%
mutate(date = ymd_h(paste0("2019-",month,"-",day," ",h))) %>%
ungroup() %>%
select(mean_t_site1:date) %>%
gather(key = "site", value = "mean_temperature", -date) %>%
ggplot(aes(x = date, y = mean_temperature, colour = site)) +
geom_line()
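Answer 2 (score: 0)
You could paste the time columns together and convert the result with as.POSIXct.
As @PavoDive already pointed out, we need numeric time columns. Check the code that generates your data, or use d[1:4] <- Map(function(x) as.numeric(as.character(x)), d[1:4]) to convert them to numeric.
Now apply paste over the rows, convert with as.POSIXct, and cbind the result to the remaining columns. sprintf first ensures that all values have the same number of digits before pasting.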
d2 <- cbind(time=as.POSIXct(apply(sapply(d[1:4], sprintf, fmt="%02d"), 1, paste, collapse=""),
format="%m%d%H%M"),
d[5:6])
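To see why the zero-padding matters, consider a single made-up timestamp. The values in parts are hypothetical and not taken from the data. Without padding, the pasted string no longer matches the fixed-width format.

```r
# Hypothetical single timestamp: 5 February, 03:10
parts <- c(month = 2, day = 5, h = 3, min = 10)

# Without padding the fields run together ambiguously
unpadded <- paste(parts, collapse = "")                   # "25310"

# Padding every field to two digits matches "%m%d%H%M"
stamp <- paste(sprintf("%02d", parts), collapse = "")     # "02050310"

# The format has no year, so R fills in the current year
time <- as.POSIXct(stamp, format = "%m%d%H%M", tz = "UTC")
```

Because the year is taken from the current date, adding an explicit year to the string and a %Y to the format may be safer for data that must be reproducible.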
Plotted nicely, here in base R:
with(d2, plot(time, t.site1, ylim=c(15, 17), xaxt="n",
xlab="time", ylab="value", type="b", col="red",
main="Time series"))
with(d2, lines(time, t.site2, type="b", col="green"))
mtext(strftime(d2$time, "%H:%M"), 1, 1, at=d2$time) # strftime gives the desired formatting
legend("bottomright", names(d2)[2:3], col=c("red", "green"), lty=rep(1, 2))
Data
d <- structure(list(month = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "2", class = "factor"),
day = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "1", class = "factor"),
h = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "0", class = "factor"),
min = structure(1:6, .Label = c("0", "10", "20", "30", "40",
"50"), class = "factor"), t.site1 = c(15.485, 15.509, 15.557,
15.557, 15.605, 15.605), t.site2 = c(16.773, 16.773, 16.773,
16.773, 16.773, 16.773)), row.names = c(NA, -6L), class = "data.frame")