How do I assign the result of mutate and distinct to another variable in R?

Asked: 2018-09-11 05:43:00

Tags: r dplyr

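I have a huge dataset with one reading every 30 seconds. First I take the hourly mean of the data, then aggregate that into daily values, and then aggregate those into monthly values. I need to assign the result of the mutate pipeline to a new dataset/variable named mE_131, which I will use to plot the monthly values. I'm a beginner, please help!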

library(dplyr)
library(ggplot2)

attach(data)
data%>% #filtering 131 and 132
  select(time,Column3,m_Pm) %>%
  filter(data,Column3=="131") 
filter(data,Column3=="132")
data_131<-filter(data,Column3=="131") 
data_132<-filter(data,Column3=="132") 

data_131%>%
  mutate(datehour= format(time,"%Y-%m-%d %H"), date1= format(time,"%Y-%m-%d"), month=format(time,"%Y-%m")) %>% 
  group_by(datehour) %>% mutate(hourlyP=mean(m_Pm)) %>% distinct(datehour, .keep_all = TRUE) %>% 
  group_by(date1) %>% mutate(dailyP=sum(hourlyP)) %>% distinct(date1, .keep_all = TRUE) %>% 
  group_by(month) %>% summarise(monthlyP=sum(dailyP))

1 answer:

Answer 0 (score: 1)

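If your goal is to compare the monthly data for Column3 == 131 and Column3 == 132, you don't have to create a separate dataset for each one, although I will show you how to do that at the end of this guide.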

First, let's create the required summary for 131 and 132:

data <- data %>%
    filter(Column3 == "131" | Column3 == "132") %>% # keep only the required rows
    mutate(datehour = format(time, "%Y-%m-%d %H"), # time keys for each aggregation level
           date1 = format(time, "%Y-%m-%d"),
           month = format(time, "%Y-%m")) %>%
    group_by(Column3, datehour) %>% # Column3 stays in every grouping so 131 and 132 are not mixed
    mutate(hourlyP = mean(m_Pm)) %>%
    distinct(datehour, .keep_all = TRUE) %>%
    group_by(Column3, date1) %>%
    mutate(dailyP = sum(hourlyP)) %>%
    distinct(date1, .keep_all = TRUE) %>%
    group_by(Column3, month) %>%
    summarise(monthlyP = sum(dailyP)) # one row per Column3 and month

Note: I have written each step of the code on its own line for readability, but it is essentially the same as the code shown above.
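
As a side note, the same hourly, daily and monthly roll-up can also be written with summarise() at each level, which avoids the mutate() plus distinct() pairs. The following is only a minimal sketch on made-up data: the toy data frame, its constant values, and the names stamps, toy and monthly are assumptions for illustration, while the column names follow the question.

```r
library(dplyr)

# Hypothetical toy data: one reading every 30 minutes over two days, for two sensors
stamps <- seq(as.POSIXct("2018-01-31 00:00:00", tz = "UTC"),
              by = "30 min", length.out = 96)
toy <- data.frame(
  time    = rep(stamps, times = 2),
  Column3 = rep(c("131", "132"), each = 96),
  stringsAsFactors = FALSE
)
toy$m_Pm <- ifelse(toy$Column3 == "131", 1, 2)  # constant values keep the result easy to check

monthly <- toy %>%
  mutate(datehour = format(time, "%Y-%m-%d %H"),
         date1    = format(time, "%Y-%m-%d"),
         month    = format(time, "%Y-%m")) %>%
  group_by(Column3, month, date1, datehour) %>%
  summarise(hourlyP = mean(m_Pm), .groups = "drop_last") %>%   # hourly mean
  summarise(dailyP = sum(hourlyP), .groups = "drop_last") %>%  # daily sum of the hourly means
  summarise(monthlyP = sum(dailyP), .groups = "drop")          # monthly sum of the daily sums
```

Each summarise() call peels off the innermost grouping variable, so the result ends up with one row per Column3 and month. The .groups argument assumes dplyr 1.0.0 or later.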

Now, let's plot it:

data %>%
    ggplot(aes(x = month, y = monthlyP, fill = Column3)) +
    geom_col(position = "dodge") # geom_col() draws bars from the y values; this gives a plot similar to your example

If you insist on having a separate dataset for each value in Column3, you can simply use the assignment operator <- to create a new data frame, as follows:

mE_131 <- data_131 %>%
   mutate(datehour= format(time,"%Y-%m-%d %H"), 
          date1= format(time,"%Y-%m-%d"),
          month=format(time,"%Y-%m")) %>% 
   group_by(datehour) %>%
   mutate(hourlyP=mean(m_Pm)) %>%
   distinct(datehour, .keep_all = TRUE) %>% 
   group_by(date1) %>%
   mutate(dailyP=sum(hourlyP)) %>%
   distinct(date1, .keep_all = TRUE) %>%
   group_by(month) %>% 
   summarise(monthlyP=sum(dailyP))

Then do the same to create mE_132. However, I don't recommend this, because plotting them together becomes harder.
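
If you do end up with separate summaries, one way to get them back into a single table for plotting is dplyr's bind_rows() with the .id argument. This is only a sketch: the two small data frames below are made-up stand-ins for mE_131 and mE_132, and the name combined is hypothetical.

```r
library(dplyr)

# Hypothetical monthly summaries standing in for mE_131 and mE_132
mE_131 <- data.frame(month = c("2018-01", "2018-02"), monthlyP = c(24, 30))
mE_132 <- data.frame(month = c("2018-01", "2018-02"), monthlyP = c(48, 52))

# Stack the two tables; .id stores the list names in a new Column3 column
combined <- bind_rows(list(`131` = mE_131, `132` = mE_132), .id = "Column3")
```

The combined table has one row per Column3 and month, so it can be passed to ggplot() with fill = Column3 just like the single summary.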
