Trouble turning the year into a separate object

Asked: 2017-04-22 21:47:43

Tags: r regex lubridate

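Hey everyone, I know I've seen posts like this before, but for some reason none of the suggestions I've tried have worked. Basically, I want to take the dates from a variable called "Production.Period.End.Date", formatted as dd/mm/yyyy, and turn each part of those dates into a separate object for analysis. The reason I'm doing this is to take the yearly average of kilowatt production, labeled "Period_kWh_Production", and track how it changes over time. I've pasted the code I have so far, in case it helps.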

setwd("C:/Users/fredd/Dropbox/Grad_Life/Spring_2017/AFM/Final_Paper/")

KWTProd.df = read.csv("Merge1//Kwht_Production_07-15.csv", header=T)

##Did this to verify "Production.Period.End.Date"

names(KWTProd.df)

##
names(KWTProd.df)
[1] "Application.Number"                     
[2] "Program.Administrator"                  
[3] "Program"                                
[4] "Total.Cost"                             
[5] "System.Owner.Sector"                    
[6] "Host.Customer.Sector"                   
[7] "Host.Customer.Physical.Address.City"    
[8] "Host.Customer.Physical.Address.County"  
[9] "Host.Customer.Physical.Address.Zip.Code"
[10] "PBI.Payment.."                          
[11] "Production.Period.End.Date"             
[12] "Period_kWh_Production" <-IT EXISTS ##
##

##Did this to plot changes of Period_kWh_Production over time##

plot(Period_kWh_Production ~ Production.Period.End.Date, data = KWTProd.df)

##Tried to do this to aggregate data in average##

aggregate(Period_kWh_Production~Production.Period.End.Date,KWTProd.df,mean)

##Still too noisy and can't find the mean by year :C##

as.date(Production.Period.End.Date, data = KWTProd.df)

##Says "Production.Period.End.Date" Not found BUT IT EXISTS##

##Tried this to group and summarise by year but it says: Error in     UseMethod("mutate_") : 
no applicable method for 'mutate_' applied to an object of class "function"         ## 

summary <- df %>%
  mutate(dates = dmy(Production.Period.End.Date),
         year  = year(Production.Period.End.Date)) %>%
  group_by(year) %>%
  summarise(mean = mean(x, na.rm = TRUE),
            sd   = sd(x, na.rm = TRUE))

##Trying this but have no clue how I am supposed to use this##

regexpr("<dd>")

1 Answer:

Answer 0 (score: 0)

This code relies on the dplyr and lubridate packages. You haven't provided sample data, so it has not been tested.

library(lubridate)
library(dplyr)

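# KWTProd.df is the data frame read in the question. A bare `df` resolves to
# the base R function stats::df, which is what triggers the 'mutate_' error.
summary <- KWTProd.df %>%
  mutate(end_date = dmy(Production.Period.End.Date),   # parse dd/mm/yyyy
         production_year  = year(end_date)) %>%        # extract the year
  group_by(production_year) %>%
  summarise(mean_kwH = mean(Period_kWh_Production, na.rm = TRUE),
            sd_kwH = sd(Period_kWh_Production, na.rm = TRUE))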
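
If you'd rather not install extra packages, here is a rough base R sketch of the same idea that also splits the date into day, month and year, as the question title asks. Note that R is case sensitive: the base function is as.Date (not as.date), and it has no data argument, so the column has to be passed explicitly. The sample rows below are made up, since the real CSV wasn't posted, and the sketch assumes the dates are plain dd/mm/yyyy strings with no extra spaces.

```r
# Hypothetical sample rows standing in for Kwht_Production_07-15.csv;
# the real file was not posted, so these values are invented.
KWTProd.df <- data.frame(
  Production.Period.End.Date = c("31/01/2014", "28/02/2014",
                                 "31/01/2015", "28/02/2015"),
  Period_kWh_Production      = c(100, 200, 300, 500),
  stringsAsFactors = FALSE
)

# Parse the dd/mm/yyyy strings into Date objects.
# as.character() guards against the column having been read as a factor.
end_date <- as.Date(as.character(KWTProd.df$Production.Period.End.Date),
                    format = "%d/%m/%Y")

# Split each date into separate integer columns.
KWTProd.df$day   <- as.integer(format(end_date, "%d"))
KWTProd.df$month <- as.integer(format(end_date, "%m"))
KWTProd.df$year  <- as.integer(format(end_date, "%Y"))

# Mean production per year, using the same aggregate() call style as the question.
yearly_mean <- aggregate(Period_kWh_Production ~ year, data = KWTProd.df, FUN = mean)
print(yearly_mean)
```

Once the year column exists, plotting Period_kWh_Production from yearly_mean against year should give a much less noisy picture than plotting against every individual end date.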