I have this data frame:
data
structure(list(Time = structure(1:4, .Label = c("2015-01-18 02:00:00",
"2015-01-18 03:00:00", "2015-01-18 04:00:00", "2015-01-18 05:00:00"
), class = "factor"), Server1 = c(12.92, NA, 10, 10.17), Server2 = c(13.42,
NA, 9.42, 10.83), Server3 = c(NA, 9.08, 9.17, 8.58)), .Names = c("Time",
"Server1", "Server2", "Server3"), class = "data.frame", row.names = c(NA,
-4L))
These are the variables:
dc=c("dc1")
type=c("Resource_Utilization")
app=c("DB")
metric=c(".PercentCPU")
I need to print each column's data on its own output lines, like this:
Server1.PercentCPU 1422165600 2 Host=Server1 source=WebTier dc=dc1 app=DB type=Resource_Utilization
This is what I am currently doing:
for (i in 2:ncol(data)){
data1<-data[i]
data1<-cbind(data[1],data1)
data1<-data1[complete.cases(data1),]
data1$Metric<-paste0(colnames(data[i]),metric)
data1$Time<-as.numeric(data1$Time)
n<-names(data1)
data1$Host=paste0("Host=",n[2])
data1$source=paste0("source=","WebTier")
data1$dc=paste0("dc=",dc)
data1$app=paste0("app=",app)
data1$type=paste0("type=",type)
data1<-data.frame(data1[,c(3,1,2,4,5,6,7,8)])
data1[,3]<-as.numeric(data[,3])*1024
write.table(data1, row.names=F, col.names=F, quote=F)
}
I get this error:
Error in `[<-.data.frame`(`*tmp*`, , 3, value = c(13742.08, NA, 9646.08, :
replacement has 4 rows, data has 3
Some cells contain NA, and I need a way to handle them in my script. Any ideas on how to do this so that only the NA cells are skipped?
Answer 0 (score: 2)
This error is caused by:
# drop rows with NA's
data1<-data1[complete.cases(data1),]
[lots of calculations]
# replace all rows of the third column of the original matrix
data1[,3]<-as.numeric(data[,3])*1024
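So you are trying to assign a longer vector (all 4 rows of the original data) to a shorter column (only 3 rows remain after the NA rows are dropped).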
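The mismatch can be reproduced in isolation. The following is a minimal sketch, not the original script: the names `df`, `short` and `res` are made up for illustration, and `tryCatch` is used only so the snippet runs to completion instead of stopping at the error.

```r
# Hypothetical one-column data frame with a single NA
df <- data.frame(a = c(1, NA, 3, 4))

# Dropping the incomplete row leaves 3 rows
short <- df[complete.cases(df), , drop = FALSE]

# Assigning all 4 original values into the 3-row data frame fails;
# the error message is captured instead of aborting the script
res <- tryCatch({
  short[, 1] <- df[, 1] * 1024
  "ok"
}, error = function(e) conditionMessage(e))

print(nrow(short))     # rows left after filtering
print(length(df[, 1])) # length of the replacement vector
print(res)             # the captured error message
```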
One way to solve this is to store the index of the valid rows and reuse it during the assignment, like this:
# drop rows with NA's
validRows <- complete.cases(data1)
data1<-data1[validRows,]
[lots of calculations]
# replace all rows of the third column of the original matrix that were valid
data1[,3]<-as.numeric(data[validRows,3])*1024
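Putting the idea together, here is one possible sketch of the whole loop that filters each server column by its own valid rows. It rests on several assumptions that are not confirmed by the question: `format_column` is a hypothetical helper, the value to scale is assumed to be the current column `i` (the original script always reads column 3), and the timestamps are assumed to be UTC and wanted as epoch seconds (the original `as.numeric()` on a factor would return the integer codes 1 to 4 instead).

```r
# Sample data from the question (Time kept as character for simplicity)
data <- data.frame(
  Time = c("2015-01-18 02:00:00", "2015-01-18 03:00:00",
           "2015-01-18 04:00:00", "2015-01-18 05:00:00"),
  Server1 = c(12.92, NA, 10, 10.17),
  Server2 = c(13.42, NA, 9.42, 10.83),
  Server3 = c(NA, 9.08, 9.17, 8.58),
  stringsAsFactors = FALSE
)
dc <- "dc1"
type <- "Resource_Utilization"
app <- "DB"
metric <- ".PercentCPU"

# Hypothetical helper: builds the output lines for one server column
format_column <- function(data, i) {
  validRows <- !is.na(data[[i]])   # rows where this server has a value
  host <- colnames(data)[i]
  # Assumption: timestamps are UTC and should be printed as epoch seconds
  epoch <- as.numeric(as.POSIXct(as.character(data$Time[validRows]), tz = "UTC"))
  # Same scaling as the original script, applied to the valid rows only
  value <- data[[i]][validRows] * 1024
  paste(paste0(host, metric), epoch, value,
        paste0("Host=", host), "source=WebTier",
        paste0("dc=", dc), paste0("app=", app), paste0("type=", type))
}

# One output line per non-NA cell, server by server
out_lines <- unlist(lapply(2:ncol(data), function(i) format_column(data, i)))
writeLines(out_lines)
```

Because the logical index is computed once per column and applied to both the timestamps and the values, the two vectors always have the same length, so the length mismatch from the original script cannot occur.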