Below is my R script, which normalizes a dataset and then uses dist() to compute the Euclidean distances.
normalize_data <- function(wine_data1)
{
  # The first column is the class label, so only the features are normalized
  feature_mat <- wine_data1[, 2:14]
  min_vector <- apply(feature_mat, 2, min)
  max_vector <- apply(feature_mat, 2, max)
  # Min-max scaling: (x - min) / (max - min)
  feature_mat <- sweep(feature_mat, 2, min_vector, FUN = "-")
  feature_mat <- sweep(feature_mat, 2, max_vector - min_vector, FUN = "/")
  distance_mat <- dist(feature_mat, method = "euclidean", diag = FALSE, upper = FALSE)
  # Row-wise minimum over the full distance matrix (this is the call that returns all zeros)
  closest_dist <- apply(as.matrix(distance_mat), 1, min)
  return(closest_dist)
}
normdata <- normalize_data(winedata)
normdata1 <- cbind(winedata[, 1], normdata) # here I am binding the class labels with the normalized data
My Euclidean distance matrix looks like this:
[1, 0, 1.2, 1.3, 1.4]
[1, 1.2, 0, 3.4, 5.1]
[2, 1.3, 1.7, 0, 3.4]
[2, 1.4, 1.9, 2.0, 0]
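The first column is the class label. For each row, I only want to find the other row at the smallest distance. So I want something like this:
[classlabel1, classlabel2, minvalue]
classlabel1 is the label of the first point, classlabel2 is the label of the row (point) at the smallest distance from it, and minvalue is the minimum distance between that point and the other points.
But when I call apply(dist_mat, 1, min), I get all zeros.
Any help is greatly appreciated. Thanks in advance.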
Answer 0: (score: 0)
If I understand your request, you just need to do the following:
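mat <- matrix(c(0, 1, 3, 4,
                3, 0, 2, 4,
                1, 2, 0, 6,
                4, 8, 2, 0),
              ncol = 4, byrow = TRUE,
              dimnames = list(c("R1", "R2", "R3", "R4"),
                              c("C1", "C2", "C3", "C4")))
# order(x)[2] is the position of the second-smallest entry, which skips the zero on the diagonal
apply(mat, 1, function(x) colnames(mat)[order(x)[2]])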
For each row, you will get the name of the column that holds the smallest nonzero value:
R1 R2 R3 R4
"C2" "C3" "C1" "C3"
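The all-zero result in the question most likely comes from the diagonal of the full distance matrix, since every point is at distance 0 from itself. As a rough sketch of one way to get the [classlabel1, classlabel2, minvalue] layout the question asks for, the diagonal can be masked with Inf before taking the row minimum. The toy `labels` and `features` below are hypothetical stand-ins for the wine data, and the column names simply mirror the question's wording.

```r
# Toy data (hypothetical): four points with two features and a class label each.
labels <- c(1, 1, 2, 2)
features <- matrix(c(0.0, 0.0,
                     0.1, 0.0,
                     1.0, 1.0,
                     1.0, 0.8),
                   ncol = 2, byrow = TRUE)

# Full symmetric distance matrix; its diagonal is all zeros.
d <- as.matrix(dist(features, method = "euclidean"))

# Mask the self-distances so they can never be the row minimum.
diag(d) <- Inf

# For each row, the index of the closest other point.
nearest <- apply(d, 1, which.min)

# One row per point: its own label, the neighbour's label, and the distance.
result <- data.frame(
  classlabel1 = labels,
  classlabel2 = labels[nearest],
  minvalue    = d[cbind(seq_len(nrow(d)), nearest)]
)
print(result)
```

Unlike `order(x)[2]`, this does not rely on the zero diagonal sorting first, so it should also behave sensibly when two distinct points happen to coincide (distance 0 between different rows).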