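I have a 10x100 data frame named CoeNIST. The rows are ordered by importance (i.e. the values in row 1 are more important than the values in row 2), and each column represents a different sample. For each sample I want to extract only the most important non-zero value, i.e. the first non-zero value.
Here is an example from the first 9 columns of CoeNIST.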
> CoeNIST[,1:9]
1 2 3 4 5 6 7 8 9
1 0 352232 0 0 0 0 0 28733 0
2 332829 0 0 380109 0 0 0 380343 0
3 0 0 0 380111 0 0 0 380409 0
4 0 0 0 380101 0 0 0 0 0
5 0 0 299211 380112 0 0 0 0 0
6 0 0 0 380103 0 0 0 0 0
7 0 0 0 380100 0 0 0 71899 0
8 0 0 0 24812 0 0 0 0 0
9 0 0 0 0 0 0 0 380410 0
10 0 332958 0 0 0 0 0 380440 0
This is the result I want:
> NIST
[1] 332829 352232 299211 380109 NA NA NA 28733 NA
Or, as a list:
> NIST
[[1]]
[1] 332829
[[2]]
[1] 352232
[[3]]
[1] 299211
[[4]]
[1] 380109
[[5]]
integer(0)
[[6]]
integer(0)
[[7]]
integer(0)
[[8]]
[1] 28733
[[9]]
integer(0)
Answer 0 (score: 3)
CoeNIST <- read.table(header=TRUE,text="
1 2 3 4 5 6 7 8 9
1 0 352232 0 0 0 0 0 28733 0
2 332829 0 0 380109 0 0 0 380343 0
3 0 0 0 380111 0 0 0 380409 0
4 0 0 0 380101 0 0 0 0 0
5 0 0 299211 380112 0 0 0 0 0
6 0 0 0 380103 0 0 0 0 0
7 0 0 0 380100 0 0 0 71899 0
8 0 0 0 24812 0 0 0 0 0
9 0 0 0 0 0 0 0 380410 0
10 0 332958 0 0 0 0 0 380440 0")
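I would restate your problem as "select the first non-zero value in each column". When a column contains only zeros, my solution gives you NA for that column. Note that the x > 0 test assumes the data contain no negative values; use x != 0 if they might.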
apply(CoeNIST,2,function(x) (x[x>0])[1])
## X1 X2 X3 X4 X5 X6 X7 X8 X9
## 332829 352232 299211 380109 NA NA NA 28733 NA
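The question also asks for a list form in which all-zero samples come back empty. A minimal sketch of that variant follows, assuming a small stand-in data frame (`samples` and its values are hypothetical, not taken from CoeNIST). Because the stand-in holds doubles, empty results print as numeric(0) rather than integer(0).

```r
# Hypothetical stand-in for CoeNIST: rows ordered by importance, one column per sample.
samples <- data.frame(
  s1 = c(0, 5, 0),
  s2 = c(7, 0, 2),
  s3 = c(0, 0, 0)
)

# lapply() walks the columns of a data frame and always returns a list.
# head(..., 1) keeps the first non-zero value, or a zero-length vector if there is none.
first_nonzero_list <- lapply(samples, function(x) head(x[x != 0], 1))

# Collapse the list to a plain vector, mapping empty results to NA.
first_nonzero_vec <- vapply(
  first_nonzero_list,
  function(v) if (length(v) == 0) NA_real_ else v,
  numeric(1)
)
```

Using lapply() rather than apply() avoids the implicit conversion of the data frame to a matrix, and vapply() makes the expected result type explicit.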
Answer 1 (score: 1)
CoeNIST = matrix(c(0, 352232, 0, 0, 0, 0, 0, 28733, 0, 332829, 0, 0, 380109, 0, 0, 0, 380343, 0, 0, 0, 0, 380111, 0, 0, 0, 380409, 0, 0, 0, 0, 380101, 0, 0, 0, 0, 0, 0, 0, 299211, 380112, 0, 0, 0, 0, 0, 0, 0, 0, 380103, 0, 0, 0, 0, 0, 0, 0, 0, 380100, 0, 0, 0, 71899, 0, 0, 0, 0, 24812, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 380410, 0, 0, 332958, 0, 0, 0, 0, 0, 380440, 0), nrow=10, ncol=9, byrow=TRUE)
> CoeNIST
        [,1]   [,2]   [,3]   [,4] [,5] [,6] [,7]   [,8] [,9]
 [1,]      0 352232      0      0    0    0    0  28733    0
 [2,] 332829      0      0 380109    0    0    0 380343    0
 [3,]      0      0      0 380111    0    0    0 380409    0
 [4,]      0      0      0 380101    0    0    0      0    0
 [5,]      0      0 299211 380112    0    0    0      0    0
 [6,]      0      0      0 380103    0    0    0      0    0
 [7,]      0      0      0 380100    0    0    0  71899    0
 [8,]      0      0      0  24812    0    0    0      0    0
 [9,]      0      0      0      0    0    0    0 380410    0
[10,]      0 332958      0      0    0    0    0 380440    0
This gives the maximum value of each column (NULL for an all-zero column, so the result is a list):
apply(CoeNIST, 2, function(x){x_max = max(x); if(x_max == 0) NULL else x_max})
This gives the first non-zero value in each column (NA when the column is all zeros):
apply(CoeNIST, 2, function(x){idx = which(x > 0); if(length(idx) == 0) NA else x[idx[1]]})
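For wider inputs, the same lookup can be done without calling a function once per column. The following is only a sketch of one vectorized alternative, not something proposed in the answers above. It relies on base R's max.col() with ties.method = "first", and `first_nonzero` and the toy matrix `m` are hypothetical names introduced for illustration.

```r
# Vectorized first-non-zero-per-column lookup (illustrative helper, not from the thread).
first_nonzero <- function(m) {
  m <- as.matrix(m)
  # 1 where the entry is non-zero, 0 elsewhere (numeric, so max.col() accepts it).
  nz <- (m != 0) * 1
  # After transposing, each row is one sample; with ties.method = "first",
  # max.col() returns the position of the first 1, i.e. the first non-zero row.
  row_idx <- max.col(t(nz), ties.method = "first")
  # Pick m[row_idx[j], j] for every column j using matrix indexing.
  vals <- m[cbind(row_idx, seq_len(ncol(m)))]
  # All-zero columns have no non-zero entry, so report NA for them.
  vals[colSums(nz) == 0] <- NA
  vals
}

# Toy input: columns are (0, 5, 0), (7, 0, 2) and (0, 0, 0).
m <- matrix(c(0, 5, 0, 7, 0, 2, 0, 0, 0), nrow = 3)
result <- first_nonzero(m)
```

Testing m != 0 instead of m > 0 means negative entries also count as non-zero, which matches the wording of the question.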