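This post is an attempt to better understand how "levels" (factor levels) work in R. Other answers do not fully explain it either (see, for example, this).
Consider the following short script, in which I compute the RMSE of each column of a random data frame df and store each value as a row of a new data frame bestcombo.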
df = as.data.frame(matrix(rbinom(10*5, 1, .5), nrow = 10, ncol = 5))
# generate empty data frame and assign column names
bestcombo = data.frame(matrix(ncol = 2, nrow = 0))
colnames(bestcombo) = c("RMSE", "Row Number")
# for each column of df, calculate the RMSE and store it together with the column index
for (i in 1:5) {
  RMSE = sqrt(mean(df[, i]^2))
  row_num = i
  row = as.data.frame(cbind(RMSE, toString(row_num)))
  colnames(row) = c("RMSE", "Row Number")
  bestcombo = rbind(bestcombo, row)
}
The problem is that "levels" are generated. Why?
bestcombo$RMSE
RMSE RMSE RMSE RMSE RMSE
0.547722557505166 0.774596669241483 0.707106781186548 0.836660026534076 0.707106781186548
Levels: 0.547722557505166 0.774596669241483 0.707106781186548 0.836660026534076
bestcombo$RMSE[1]
RMSE
0.547722557505166
Levels: 0.547722557505166 0.774596669241483 0.707106781186548 0.836660026534076
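Why does this happen, and how can I avoid it? Is it caused by using rbind() incorrectly?
It also causes other problems. For example, the order() function does not sort the rows by value.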
bestcombo[order(bestcombo$RMSE),]
RMSE Random Vector
1 0.547722557505166 1
2 0.774596669241483 2
3 0.707106781186548 3
5 0.707106781186548 5
4 0.836660026534076 4
Answer 0: (score: 3)
You want something more like this:
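# for each column of df, calculate the RMSE and store it together with the column index
for (i in 1:5) {
  RMSE = sqrt(mean(df[, i]^2))
  row_num = i
  # data.frame() keeps RMSE numeric; check.names = FALSE preserves the space in "Row Number"
  row = data.frame(RMSE = RMSE, `Row Number` = as.character(row_num),
                   check.names = FALSE, stringsAsFactors = FALSE)
  #colnames(row) = c("RMSE", "Row Number")
  bestcombo = rbind(bestcombo, row)
}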
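The reason the loop above avoids factors can be shown in isolation. The following is a minimal sketch, not part of the original answer: it assumes a single made-up RMSE value, and it sets stringsAsFactors = TRUE explicitly to reproduce the default of R versions before 4.0.0 (from 4.0.0 on, the same call yields a character column instead of a factor).

```r
# cbind() on a number and a string returns a character matrix,
# so the numeric RMSE is stored as text.
RMSE <- 0.5477                      # made-up value for illustration
m <- cbind(RMSE, toString(1))
is.character(m)                     # TRUE

# Converting the character matrix to a data frame turns the text into a factor
# when stringsAsFactors = TRUE (the default before R 4.0.0).
bad <- as.data.frame(m, stringsAsFactors = TRUE)
class(bad$RMSE)                     # "factor"

# data.frame() keeps each argument's own type, so RMSE stays numeric.
good <- data.frame(RMSE = RMSE, row_num = as.character(1),
                   stringsAsFactors = FALSE)
class(good$RMSE)                    # "numeric"
```

So rbind() is not the culprit here; the coercion happens earlier, in cbind().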
Or, if you really want to set the column names on a second line, do this:
for (i in 1:5) {
  RMSE = sqrt(mean(df[, i]^2))
  row_num = i
  row = data.frame(RMSE, as.character(row_num))
  colnames(row) = c("RMSE", "Row Number")
  bestcombo = rbind(bestcombo, row)
}
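If a factor column like the one in the question already exists, it can be converted back to numbers. The sketch below is an illustration rather than part of the original answer: the vector f is a hand-built stand-in for bestcombo$RMSE, with its levels in order of first appearance as printed in the question.

```r
# Hypothetical factor mimicking bestcombo$RMSE from the question.
vals <- c("0.547722557505166", "0.774596669241483", "0.707106781186548",
          "0.836660026534076", "0.707106781186548")
f <- factor(vals, levels = unique(vals))

order(f)        # sorts by level position, not by value: 1 2 3 5 4
as.numeric(f)   # returns the level codes, not the values: 1 2 3 4 3

# Go through character first to recover the actual numbers.
x <- as.numeric(as.character(f))
order(x)        # sorts by value: 1 3 5 2 4
```

This also shows why order() appeared not to work: it ordered the rows by the position of each level, which matches the row order printed in the question.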
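Just for completeness, I will add that, although it is not the problem you are asking about, growing a data frame this way, by rbind-ing one more row on each iteration, will start to incur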
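
One way to avoid growing the data frame inside a loop is to compute all the values first and build the data frame once. This is a minimal sketch under assumptions, not the answer author's code: it reuses the df layout from the question, and the seed and the use of vapply() are choices made here for illustration.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible
df <- as.data.frame(matrix(rbinom(10 * 5, 1, 0.5), nrow = 10, ncol = 5))

# Compute the RMSE of every column in one pass; vapply() guarantees a numeric vector.
rmse <- vapply(df, function(col) sqrt(mean(col^2)), numeric(1))

# Build the result once instead of calling rbind() in each iteration.
bestcombo <- data.frame(RMSE = unname(rmse),
                        `Row Number` = as.character(seq_along(rmse)),
                        check.names = FALSE, stringsAsFactors = FALSE)

# RMSE is numeric, so ordering sorts by value.
bestcombo[order(bestcombo$RMSE), ]
```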