I have a matrix like the one below.
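I want to eliminate the columns of the matrix one at a time according to their weights. At each iteration, the column with the smallest weight should be removed.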
ID       885038  885039  885040  885041  885042  885043  Class
weights  0       0.005   0       0.018   0       0.007   N/A
1267359  2       0       0       0       0       1       1
1295720  2       0       1       0       0       1       1
1295721  2       0       0       0       0       1       1
1295723  2       0       0       0       0       1       1
1295724  2       0       1       0       1       1       1
The columns should be removed in this order:
iteration 1: 885038, 885040, 885042 (columns would be removed)
iteration 2: 885039 (removed)
iteration 3: 885043 (removed)
iteration 4: 885041 (removed)
The code I wrote for this task is:
library(readr)
dummy <- read_csv("dummy.csv")
dummy <- t(dummy)
colnames(dummy) <- dummy[1,]
dummy <- dummy[-1,]
for (i in 1:ncol(dummy))
{
min <- as.matrix(which.min(dummy[,1] == min(dummy[,1])))
filter <- row.names(min)
data<- dummy[setdiff(rownames(dummy),filter),]
data <- data[-1,]
print(ncol(data))
}
However, it gives the output below, whereas I want the number of remaining columns to be printed at each iteration:
[1] 6
[1] 6
[1] 6
[1] 6
[1] 6
[1] 6
Could one of the apply functions be used here?
Answer 0 (score: 1)
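Your data appears to be a matrix of strings, so the values need to be converted to numeric before you compare them. As Sotos requested, a reproducible example would help: from the way the example is presented, we cannot tell whether your data is stored as strings or as numbers.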
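To illustrate why the type matters, here is a minimal sketch with made-up values (the vector `x` is purely illustrative and is not taken from your data). On a character vector, `min()` compares strings rather than numbers, so the result can differ from the numeric minimum.

```r
# Illustrative values only, not taken from the question's data
x <- c("10", "9", "2")

is.character(x)          # TRUE: these are strings, not numbers
min(x)                   # "10", because strings are compared character by character

# Convert to numeric before looking for the smallest value
x_num <- as.numeric(x)
min(x_num)               # 2
which.min(x_num)         # 3
```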
# reproduce your data for you
df = data.frame(matrix(data=
c("weights",0, 0.005, 0, 0.018, 0, 0.007, "N/A",
1267359,2, 0, 0, 0, 0, 1, 1,
1295720,2, 0, 1, 0, 0, 1, 1,
1295721,2, 0, 0, 0, 0, 1, 1,
1295723,2, 0, 0, 0, 0, 1, 1,
1295724,2, 0, 1, 0, 1, 1, 1),
ncol = 8, byrow = TRUE
), stringsAsFactors = FALSE)
colnames(df) = c("ID","885038", "885039", "885040", "885041", "885042", "885043", "Class")
df2 = df[, !colnames(df)%in%c("Class")]
# dummy <- read_csv("dummy.csv")
dummy = df2
dummy <- t(dummy)
colnames(dummy) <- dummy[1,]
dummy <- dummy[-1,]
# "dummy" is now a character matrix. You need a data.frame with a numeric weights column.
# weights 1267359 1295720 1295721 1295723 1295724
# 885038 "0" "2" "2" "2" "2" "2"
# 885039 "0.005" "0" "0" "0" "0" "0"
# 885040 "0" "0" "1" "0" "0" "1"
# 885041 "0.018" "0" "0" "0" "0" "0"
# 885042 "0" "0" "0" "0" "0" "1"
# 885043 "0.007" "1" "1" "1" "1" "1"
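class(dummy) # "matrix" (R >= 4.0 also reports "array")
dummy = data.frame(dummy)
dummy$weights = as.numeric(as.character(dummy$weights))
class(dummy$weights) # [1] "numeric"
data = dummy
# Each row of "dummy" is one original column, so iterate over the rows
for (i in seq_len(nrow(dummy)))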
{
rowMin = which.min(data$weights)
print(paste(nrow(data),rownames(data)[rowMin]))
data = data[-rowMin,]
}
# [1] "6 885038"
# [1] "5 885040"
# [1] "4 885042"
# [1] "3 885039"
# [1] "2 885043"
# [1] "1 885041"
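As for whether something from the apply family can be used: the loop above removes one column per iteration, while the order you describe removes all columns with the same weight together. One possible loop-free sketch of that grouping uses `split()`. It assumes the weights are already numeric and available as a named vector; the names `w`, `removal_groups` and `remaining` are introduced here for illustration only.

```r
# Weights copied from the "weights" row of the question, named by column ID
w <- c("885038" = 0, "885039" = 0.005, "885040" = 0,
       "885041" = 0.018, "885042" = 0, "885043" = 0.007)

# Group the column IDs by weight; the groups come out in increasing weight order
removal_groups <- unname(split(names(w), w))

# Number of columns still present after each iteration
remaining <- length(w) - cumsum(lengths(removal_groups))

for (i in seq_along(removal_groups)) {
  cat("iteration ", i, ": ", paste(removal_groups[[i]], collapse = ", "),
      " (removed), ", remaining[i], " column(s) left\n", sep = "")
}
# iteration 1: 885038, 885040, 885042 (removed), 3 column(s) left
# iteration 2: 885039 (removed), 2 column(s) left
# iteration 3: 885043 (removed), 1 column(s) left
# iteration 4: 885041 (removed), 0 column(s) left
```

To actually drop a group from a matrix `m` whose column names are these IDs, something like `m[, !(colnames(m) %in% removal_groups[[1]]), drop = FALSE]` should work; `drop = FALSE` keeps the result a matrix even when a single column is left.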