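I have a large dataset with one reading every 30 seconds. First I take the mean of the readings within each hour, then I sum the hourly values into daily values, and then I sum the daily values into monthly values. I need to assign the result of the mutate pipeline to a new dataset/variable named mE_131, which I will use to plot the monthly values. I'm a beginner, please help!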
library(dplyr)
library(ggplot2)
# filtering 131 and 132
data_131 <- data %>%
  select(time, Column3, m_Pm) %>%
  filter(Column3 == "131")
data_132 <- data %>%
  select(time, Column3, m_Pm) %>%
  filter(Column3 == "132")
data_131 %>%
  mutate(datehour = format(time, "%Y-%m-%d %H"), date1 = format(time, "%Y-%m-%d"), month = format(time, "%Y-%m")) %>%
  group_by(datehour) %>% mutate(hourlyP = mean(m_Pm)) %>% distinct(datehour, .keep_all = TRUE) %>%
  group_by(date1) %>% mutate(dailyP = sum(hourlyP)) %>% distinct(date1, .keep_all = TRUE) %>%
  group_by(month) %>% summarise(monthlyP = sum(dailyP))
Answer 0 (score: 1)
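If your goal is to compare the monthly data for Column3 == 131 with the monthly data for Column3 == 132, you don't have to create a separate dataset for each one, although I will show you how to do that at the end of this guide.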
First, let's create the required summary for both 131 and 132:
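data <- data %>%
  filter(Column3 == "131" | Column3 == "132") %>%  # keep only the two series of interest
  mutate(datehour = format(time, "%Y-%m-%d %H"),   # hour, day and month labels
         date1    = format(time, "%Y-%m-%d"),
         month    = format(time, "%Y-%m")) %>%
  group_by(Column3, datehour) %>%
  mutate(hourlyP = mean(m_Pm)) %>%
  distinct(datehour, .keep_all = TRUE) %>%
  group_by(Column3, date1) %>%
  mutate(dailyP = sum(hourlyP)) %>%
  distinct(date1, .keep_all = TRUE) %>%
  group_by(Column3, month) %>%
  summarise(monthlyP = sum(dailyP))
Note: I've written each step on its own line for readability, but it is essentially the same as the code shown above. The one substantive change is that Column3 is included in every group_by, so that the two series are not averaged together and Column3 is still available for the plot.
Now, let's plot:
data %>%
  ggplot(aes(x = month, y = monthlyP, fill = factor(Column3))) +
  geom_col(position = "dodge") # bar heights come from monthlyP; this will produce a plot similar to the one in your example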
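The mutate() plus distinct() pairs in the pipeline above can also be written with summarise(), which collapses each group to a single row directly. The following is only a minimal sketch, assuming dplyr 1.0 or later (for the .groups argument). The toy data frame and the name monthly are made up for illustration; only the column names time, Column3 and m_Pm come from the question.

```r
library(dplyr)

# Toy stand-in for the real data: two hours of 30-second readings per series.
# The values are random and purely illustrative.
set.seed(1)
times <- seq(as.POSIXct("2020-01-01 00:00:00", tz = "UTC"),
             by = "30 sec", length.out = 240)
toy <- data.frame(
  time    = rep(times, times = 2),
  Column3 = rep(c("131", "132"), each = 240),
  m_Pm    = runif(480)
)

monthly <- toy %>%
  mutate(datehour = format(time, "%Y-%m-%d %H"),
         date1    = format(time, "%Y-%m-%d"),
         month    = format(time, "%Y-%m")) %>%
  group_by(Column3, month, date1, datehour) %>%
  summarise(hourlyP = mean(m_Pm), .groups = "drop") %>%   # one row per series and hour
  group_by(Column3, month, date1) %>%
  summarise(dailyP = sum(hourlyP), .groups = "drop") %>%  # one row per series and day
  group_by(Column3, month) %>%
  summarise(monthlyP = sum(dailyP), .groups = "drop")     # one row per series and month

monthly
```

Carrying month and date1 along in the grouping keeps those labels in the result without needing .keep_all, and the output has exactly one row per series and month.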
If you insist on having a separate dataset for each value of Column3, you can simply use the assignment operator <- to create a new data frame, like this:
mE_131 <- data_131 %>%
  mutate(datehour = format(time, "%Y-%m-%d %H"),
         date1    = format(time, "%Y-%m-%d"),
         month    = format(time, "%Y-%m")) %>%
  group_by(datehour) %>%
  mutate(hourlyP = mean(m_Pm)) %>%
  distinct(datehour, .keep_all = TRUE) %>%
  group_by(date1) %>%
  mutate(dailyP = sum(hourlyP)) %>%
  distinct(date1, .keep_all = TRUE) %>%
  group_by(month) %>%
  summarise(monthlyP = sum(dailyP))
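Then do the same to create mE_132. However, I don't recommend this approach, because plotting the two series together becomes harder when they live in separate data frames.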
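If the separate data frames already exist, one way to still plot them side by side is to stack them into a single data frame first. This is a minimal sketch, assuming mE_131 and mE_132 each have the columns month and monthlyP as produced above; the numbers below are made up, and the name combined is only for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical monthly summaries standing in for mE_131 and mE_132
mE_131 <- data.frame(month = c("2020-01", "2020-02"), monthlyP = c(10.5, 12.0))
mE_132 <- data.frame(month = c("2020-01", "2020-02"), monthlyP = c(8.2, 9.7))

# Stack the two frames; .id stores the name of the source frame in Column3
combined <- bind_rows("131" = mE_131, "132" = mE_132, .id = "Column3")

# One dodged bar per series and month
p <- ggplot(combined, aes(x = month, y = monthlyP, fill = Column3)) +
  geom_col(position = "dodge")
p
```

This recovers the same shape of data that the single-pipeline approach produces, which is why keeping both series in one data frame from the start is usually less work.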