I need to run a two-step analysis on the following data:
1 5 1 -2
2 6 3 4
1 5 4 -3
NA NA NA NA
2 5 4 -4
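Step 1. Remove all NA rows (these are always entire rows, never individual cells).
Step 2. Sort the rows by the value in the 4th column, in descending order.
The result should look like this: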
2 6 3 4
1 5 1 -2
1 5 4 -3
2 5 4 -4
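How can I do this efficiently, given that the dataset may be large (e.g. 100,000 entries)?
Answer 0: (score: 1)
Another approach is to remove all the NA rows first and then sort the matrix.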
# make a matrix
my_mat <- matrix(c(1,2,1,1,2,5,6,5,2,5,1,3,4,2,4,-2,4,-3,2,-4),
nrow = 5, ncol = 4)
# add some NA values
my_mat[4,] <- NA
[,1] [,2] [,3] [,4]
[1,] 1 5 1 -2
[2,] 2 6 3 4
[3,] 1 5 4 -3
[4,] NA NA NA NA
[5,] 2 5 4 -4
# remove every row that contains an NA; as stated in the question,
# NAs always occupy the entire row
# (drop = FALSE keeps the result a matrix even if only one row remains)
my_mat <- my_mat[complete.cases(my_mat), , drop = FALSE]
# order by the 4th column, largest value first
my_mat[order(my_mat[, 4], decreasing = TRUE), , drop = FALSE]
[,1] [,2] [,3] [,4]
[1,] 2 6 3 4
[2,] 1 5 1 -2
[3,] 1 5 4 -3
[4,] 2 5 4 -4
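Because the question guarantees that an NA always fills a whole row, the two steps can probably be folded into a single indexing operation. The sketch below relies on the `na.last = NA` argument of `order()`, which omits NA values from the ordering, so rows whose 4th column is NA are never selected. The helper name `sort_rows_desc` and its `col` parameter are hypothetical and not part of the original answer. Whether this is measurably faster than the two-step version on 100,000 rows has not been tested here.

```r
# Hypothetical helper (not from the original answer): drop NA rows and
# sort by one column in a single indexing step.
sort_rows_desc <- function(m, col = 4) {
  # na.last = NA makes order() leave out the positions of NA values,
  # so rows whose sort key is NA are dropped. This assumes, as the
  # question states, that an NA always fills the whole row.
  # drop = FALSE keeps the result a matrix even if one row remains.
  m[order(m[, col], decreasing = TRUE, na.last = NA), , drop = FALSE]
}

# Same data as in the question, filled column by column
my_mat <- matrix(c( 1,  2,  1, NA,  2,
                    5,  6,  5, NA,  5,
                    1,  3,  4, NA,  4,
                   -2,  4, -3, NA, -4),
                 nrow = 5, ncol = 4)

result <- sort_rows_desc(my_mat)
result
```

If rows could contain NAs in only some cells, this shortcut would keep rows that `complete.cases()` removes, so the two-step version is the safer choice in that case.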