R programming: extracting the mean temperature by month

Time: 2019-09-23 03:15:36

Tags: r csv dataframe

I have a data frame to which I have also added a new column, "Month" (data$Month). My goal is to extract the mean minimum temperature by month and display it.

monthly_maximum_mean <- data %>% group_by(data$Month) %>% summarize(mean(data$Maximum_temperature))
monthly_maximum_mean 

However, I am running into some problems.
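
One likely cause, sketched here under assumptions: inside `group_by()` and `summarise()`, dplyr expects bare column names. Writing `data$Maximum_temperature` there takes the mean of the entire column for every group, so each month would show the same value. The toy data frame below is hypothetical and only reuses the column names from the question; the result names `mean_min` and `mean_max` are made up for illustration.

```r
library(dplyr)

# Hypothetical stand-in for the real data; column names follow the question.
toy <- data.frame(
  Month = c("08", "08", "09", "09"),
  Minimum_temperature = c(10, 12, 14, NA),
  Maximum_temperature = c(20, 22, 24, 26)
)

# Use bare column names so each mean is computed within its own group.
# na.rm = TRUE skips missing readings instead of returning NA for the group.
monthly_mean <- toy %>%
  group_by(Month) %>%
  summarise(
    mean_min = mean(Minimum_temperature, na.rm = TRUE),
    mean_max = mean(Maximum_temperature, na.rm = TRUE)
  )

print(monthly_mean)
```

With the real data, the same pattern would apply to `data` in place of `toy`, provided `data` is still a data frame at that point.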

I was told that I need to make the "Month" column "ordinal" (I thought it already was).

#Change the type of "Month" column from Character to Ordinal with levels as the number of months(i.e. 13)
data$Month <- as.ordered(data$Month)
typeof(data$Month)

However, this does not seem to do what I want.
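
Part of the confusion may come from `typeof()`, which reports the storage type of a factor ("integer") rather than its class. A minimal base-R sketch, assuming the levels are the twelve two-digit month codes produced by `format(Date, "%m")`; the vector `month_chr` is hypothetical sample input.

```r
# Hypothetical month strings as produced by format(Date, "%m")
month_chr <- c("08", "12", "01", "08")

# Spell out the levels so the ordering is explicit ("01" < "02" < ... < "12")
month_levels <- sprintf("%02d", 1:12)
month_ord <- factor(month_chr, levels = month_levels, ordered = TRUE)

print(typeof(month_ord))      # storage type of any factor is "integer"
print(class(month_ord))       # "ordered" "factor"
print(is.ordered(month_ord))  # TRUE
```

Checking with `class()` or `is.ordered()` is therefore a more telling test than `typeof()` for whether the conversion worked.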

I only started with R this week, so I am still frantically figuring out all the nuts and bolts.

Here is my code so far:

# load the tidyverse library
library("tidyverse")
setwd("C:/Users/ibrahim.cetinkaya/OneDrive - NTT/Desktop/data")
##################### Part A #####################
# data files (you need to specify the paths of the CSV files, e.g. relative or absolute)
files <- c("data/201808.csv",
       "data/201809.csv",
       "data/201810.csv",
       "data/201811.csv",
       "data/201812.csv",
       "data/201901.csv",
       "data/201902.csv",
       "data/201903.csv",
       "data/201904.csv",
       "data/201905.csv",
       "data/201906.csv",
       "data/201908.csv"
)

#Concatenate into one data frame. 
data <- data.frame()
for (i in seq_along(files)) {
  temp <- read_csv(files[i], skip = 7)
  data <- rbind(data, temp)
}
#View to verify
view(data)

#Part 2
#Remove variables which have no data at all (all the values are NA)
#Remove variables that don't have adequate data (70% of the number of records are NA)
#Note: as written, the line below filters rows (keeping rows with at most 90% NA), not variables
data <- data[rowMeans(is.na(data))<=0.9,]
view(data)


#Change the column names to have no spaces between the words

names(data) <- gsub(" ", "_", names(data))
view(data)


#Convert Date to date type

data$Date <- as.Date(data$Date, "%d/%m/%Y")

#Checking if it worked
typeof(data$Date)
view(data)


#Add a new column and name it "Month", you may extract the contents of this column from the "Date" column
data$Month <- format(data$Date, "%m")
view(data)

#Change the type of "Month" column from Character to Ordinal with levels as the number of months (i.e. 13)
data$Month <- as.ordered(data$Month)
typeof(data$Month)

#For all of the numeric columns, replace the remaining NAs with the median value of the values in the column.

#Part 3
#Show summary of min_temp, max_temp, 9am_temp, 3pm_temp, speedofgust
summary(data$Minimum_temperature)
summary(data$Maximum_temperature)
summary(data$`9am_Temperature`)
summary(data$`3pm_Temperature`)
summary(data$`Speed_of_maximum_wind_gust_(km/h)`)

#Extract the mean of minimum temperature by month
data <- fct_explicit_na(data$Month)
monthly_mean <- data %>% group_by(data$Month) %>% summarise(mean = mean(data$Minimum_temperature))

#Extract the mean of maximum temperature by month
monthly_maximum_mean <- data %>% group_by(data$Month) %>% summarize(mean(data$Maximum_temperature))
monthly_maximum_mean 
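
A further point worth checking in the script above: `data <- fct_explicit_na(data$Month)` assigns a factor to `data`, so the data frame is replaced and the later `group_by()` calls no longer receive a data frame. A minimal sketch of assigning back to the column instead, assuming forcats 1.0.0 or later, where `fct_na_value_to_level()` supersedes the deprecated `fct_explicit_na()`. The toy data and the `"(Missing)"` label are hypothetical.

```r
library(dplyr)
library(forcats)

# Hypothetical toy data; the last row has no month
toy <- data.frame(
  Month = factor(c("08", "08", "09", NA)),
  Minimum_temperature = c(10, 12, 14, 16)
)

# Assign the result back to the column, not to the whole data frame,
# so that toy remains a data frame for the summary step.
toy$Month <- fct_na_value_to_level(toy$Month, level = "(Missing)")

monthly_min <- toy %>%
  group_by(Month) %>%
  summarise(mean_min = mean(Minimum_temperature, na.rm = TRUE))

print(monthly_min)
```

On older forcats versions, `fct_explicit_na(toy$Month)` assigned to `toy$Month` would play the same role.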

0 answers:

No answers