Given the following data.table:
library(data.table)  # provides as.data.table() and the := operator
DT <- as.data.table(
cbind(PREC_01N=c(0.0,0.25,2.29,9.77,26.00,0.93,0.00,5.54,9.91,0.00,0.01,0.0),
PREC_01P=c(1.73,0.00,0.01,7.55,0.00,0.11,65.09,13.60,7.09,13.87,5.15,0.87),
PREC_02N=c(0.0,0.26,0.00,9.58,1.50,2.46,0.03,4.94,0.00,1.53,6.11,0.02),
PREC_02P=c(0.33,57.20,10.95,2.89,0.81,2.59,0.00,4.63,11.05,1.53,10.43,1.98),
PREC_03N=c(1.26,0.04,0.00,27.25,0.00,3.87,0.01,0.48,17.73,0.05,12.14,0.02),
PREC_03P=c(0.21,5.74,0.00,1.59,23.35,1.36,0.00,3.75,6.14,0.37,0.00,0.00),
PREC_04N=c(0.00,0.34,1.52,15.20,0.00,3.43,0.07,0.00,0.01,15.12,25.55,0.04),
PREC_04P=c(5.42,9.13,20.64,12.68,35.68,27.05,0.00,0.02,0.00,1.60,0.00,0.67),
PREC_05N=c(0.03,3.56,0.08,9.98,0.01,3.94,0.32,0.00,15.58,0.01,0.00,0.00),
PREC_05P=c(0.21,0.02,57.97,0.01,0.00,4.31,0.00,1.55,13.03,0.07,54.75,0.78),
PREC_06N=c(0.19,4.08,0.10,12.22,0.00,0.72,0.03,0.09,15.19,0.01,9.29,0.18),
PREC_06P=c(0.05,0.59,0.29,6.65,35.56,14.02,0.02,0.38,13.46,0.00,1.07,0.00),
PREC_07N=c(0.42,4.50,11.36,3.34,4.04,0.02,0.03,0.00,1.66,0.00,9.44,0.00),
PREC_07P=c(0.35,10.37,13.12,13.24,8.29,30.73,0.72,0.01,9.74,0.75,5.77,0.00),
PREC_AVN=c(1.26,0.00,16.92,13.09,1.43,6.13,0.00,12.10,8.23,1.00,7.99,0.00)
))
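For testing purposes, I create two columns using two different methods, each holding the row-wise mean of the 15 columns:
DT[, PREC_MEAN := rowMeans(DT[, 1:15, with = FALSE])]        # Create column PREC_MEAN - FASTER
DT[, PREC_MEAN2 := apply(DT[, 1:15, with = FALSE], 1, mean)] # Create column PREC_MEAN2 - SLOWER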
To my surprise, the two columns differ in some rows:
identical(DT$PREC_MEAN, DT$PREC_MEAN2) # FALSE ?????
DTbad <- DT$PREC_MEAN != DT$PREC_MEAN2 # Logical vector
sum(DTbad) # 10 inequalities????
DT <- cbind(ROWID=1:nrow(DT),DT) # Adding a ROWID col to create the IDENTICAL column
DT[,IDENTICAL:=identical(PREC_MEAN, PREC_MEAN2), by=ROWID] # By the way, is there another easier way?
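On the side question of an easier way to build the IDENTICAL column, one possible approach is a plain vectorized comparison. This is a minimal sketch on a hypothetical toy table (`toy`, `A` and `B` are illustrative names, not part of the data above):

```r
library(data.table)

# Hypothetical toy table, not the data above: the first row differs
# only by floating-point rounding (0.1 + 0.2 is not exactly 0.3).
toy <- data.table(A = c(0.1 + 0.2, 0.5), B = c(0.3, 0.5))

# `==` is vectorized, so it compares the two columns row by row
# without needing a ROWID column or `by=`.
toy[, IDENTICAL := A == B]
print(toy)
```

Applied to the table above, this would read `DT[, IDENTICAL := PREC_MEAN == PREC_MEAN2]`. One difference to keep in mind: `==` returns NA where either value is NA, whereas `identical()` returns TRUE or FALSE.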
10 of the 12 rows show different mean values!
DT[, list(PREC_MEAN, PREC_MEAN2, IDENTICAL)] # What is different?
DT[, list(format(PREC_MEAN, scientific = TRUE), format(PREC_MEAN2, scientific = TRUE), IDENTICAL)] # Trying via scientific notation
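The default print precision may hide very small differences. A minimal sketch for measuring how far apart the two results are, under the assumption (not confirmed here) that the mismatch comes from floating-point rounding in the two routines; the matrix `m` is a hypothetical excerpt of the first three rows of three columns:

```r
# Hypothetical excerpt: first three rows of three of the columns above.
m <- cbind(PREC_01N = c(0.0, 0.25, 2.29),
           PREC_01P = c(1.73, 0.00, 0.01),
           PREC_02N = c(0.0, 0.26, 0.00))

fast <- rowMeans(m)         # same call as used for PREC_MEAN
slow <- apply(m, 1, mean)   # same call as used for PREC_MEAN2

# Print with 17 significant digits, enough to distinguish any two doubles.
print(sprintf("%.17g", fast))
print(sprintf("%.17g", slow))

# Largest absolute difference between the two results.
max_diff <- max(abs(fast - slow))
print(max_diff)

# all.equal() compares within a numeric tolerance instead of bit for bit.
same_within_tol <- isTRUE(all.equal(fast, slow))
print(same_within_tol)
```

On the full table, the corresponding check would be `all.equal(DT$PREC_MEAN, DT$PREC_MEAN2)`. If it returns TRUE while `identical()` returns FALSE, the two columns agree up to rounding.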
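DT is a subset of a 572,400 x 66 data.table. Running the same procedure on the full table revealed the 10 differences reproduced here; I also added 2 rows where the values match, the first and the last.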
Does anyone know what is going on? Why do these differences occur?
Thanks in advance.