I extracted the following data from a large dataset:
  Short.name  | Date             | Utilization | count | hour
1 XYX-cNum0   | X15.05.19..00.00 | 75.57       | 1     | 00
2 ABC-cNum1   | X15.05.19..00.00 | 96.08       | 1     | 00
3 DEF-cNum11  | X15.05.19..00.00 | 96.01       | 1     | 00
4 ZEF-cNum12  | X15.05.19..00.00 | 68.68       | 0     | 00
5 ABC1-cNum13 | X15.05.19..00.00 | 98.17       | 1     | 00
6 ABC-cNum14  | X15.05.19..00.00 | 92.56       | 1     | 00
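From the above, I need output that gives, for each hour, the total number of rows where count is 1, with all the matching Short.name values listed in the same row. Note that hour takes values from 00 to 23.
I tried the following code to produce the output: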
# Read the data into the "base_data" object
base_data <- read.csv(filelocation, header = TRUE, sep = ",")
data_trim <- base_data[, -c(1, 3, 4, 5, 6)]
# Data cleansing
library(tidyr)
# Reshape to long format: one row per Short.name and Date
gather_data <- data_trim %>% gather(Date, Utlization, -`Short.name`)
# Flag utilization > 70 with a boolean value and store it in count
gather_data$count <- ifelse(gather_data$Utlization > 70, 1, 0)
# Extract the hour from the date string and store it in column hour
gather_data$hour <- substr(gather_data$Date, 12, 13)
# Function definition: return the matching Short.name values and their count
extractShrtName <- function(df, ht) {
  count_sum <- subset(df, hour == ht & count == 1)
  row_value <- count_sum[, 1]
  ln <- length(which(count_sum$count == 1))
  # return() accepts a single value, so wrap both results in a list
  return(list(row_value, ln))
}
extractShrtName(gather_data, "00")
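In R, return() accepts only one value, so a function that needs to hand back both the cell names and their count has to bundle them, for example in a named list. The following is a minimal sketch of that idea, built on the six sample rows from the table above; the helper name extract_short_names and the list element names cells and n are made up for illustration and are not part of the original code.

```r
# Sample rows copied from the table in the question.
gather_data <- data.frame(
  Short.name = c("XYX-cNum0", "ABC-cNum1", "DEF-cNum11",
                 "ZEF-cNum12", "ABC1-cNum13", "ABC-cNum14"),
  Date = "X15.05.19..00.00",
  Utilization = c(75.57, 96.08, 96.01, 68.68, 98.17, 92.56),
  stringsAsFactors = FALSE
)
# Same flag and hour extraction as in the question.
gather_data$count <- ifelse(gather_data$Utilization > 70, 1, 0)
gather_data$hour <- substr(gather_data$Date, 12, 13)

# Hypothetical helper: return both results wrapped in one named list.
extract_short_names <- function(df, ht) {
  hits <- df[df$hour == ht & df$count == 1, ]
  list(cells = hits$Short.name, n = nrow(hits))
}

res <- extract_short_names(gather_data, "00")
res$n      # number of cells above the threshold in hour "00"
res$cells  # their Short.name values
```

Naming the list elements lets the caller pick out each piece with `$` instead of relying on position.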
The output I am after looks like this:
hour | countofcells | celllist
00   | 893          | XYX-cNum0|ABC-cNum1|... till 893 cell list
14   | 113          | List of all 113 cells
06   | 85           | List of all 85 cells
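One possible way to build a table of this shape for every hour at once, rather than calling a function hour by hour, is to split the flagged cell names by hour and then collapse each group into a single string. This is only a sketch in base R under stated assumptions: the data frame cells and the result summary_df are hypothetical names, and the hour "14" rows are invented for illustration, not taken from the real dataset.

```r
# Small illustrative input: hour "00" rows follow the question's table,
# hour "14" rows are invented so that more than one hour appears.
cells <- data.frame(
  Short.name = c("XYX-cNum0", "ABC-cNum1", "ZEF-cNum12",
                 "XYX-cNum0", "ZEF-cNum12"),
  hour  = c("00", "00", "00", "14", "14"),
  count = c(1, 1, 0, 1, 1),
  stringsAsFactors = FALSE
)

# Keep only the flagged rows, then group the cell names by hour.
hits <- cells[cells$count == 1, ]
by_hour <- split(hits$Short.name, hits$hour)

# One row per hour: number of cells and a "|"-separated cell list.
summary_df <- data.frame(
  hour = names(by_hour),
  countofcells = lengths(by_hour),
  celllist = vapply(by_hour, paste, character(1), collapse = "|"),
  row.names = NULL,
  stringsAsFactors = FALSE
)
print(summary_df)
```

Hours with no flagged cells do not appear in the result; if all of 00 through 23 must be listed, the missing hours would have to be added afterwards with a count of 0.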