Trying to generate a cluster count from 2D data through list manipulation

Time: 2019-07-15 20:10:50

Tags: r

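I wrote a script that extracts raw data from a set of files as vectors of varying sizes. The data itself consists of times, and I am trying to find clusters of all times that occur within 0.5 seconds of each other. I am trying to solve this with vector manipulation, but I cannot figure out how to get an accurate count of the clusters (let alone the cluster sizes).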

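I tried writing a function that compares the numbers one by one with a series of if/else statements, but that approach turned out to be very error-prone: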

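cluster.count <- function(x, threshold = 0.5){

  # Assumes x is a numeric vector of times sorted in ascending order.
  len = length(x)
  if (len == 0){
    return(0)  # no times, no clusters
  }

  counter = 1   # R indexes from 1; index 0 selects nothing
  clusters = 1  # the first time always opens a cluster

  # Compare each time with the one that follows it.
  while (counter < len){
    if (x[counter + 1] - x[counter] > threshold){
      # Gap larger than the threshold: a new cluster starts here.
      clusters = clusters + 1
    }
    counter = counter + 1
  }

  return(clusters)
}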

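So I decided to switch to vector manipulation instead: subtracting a copy of the vector, shifted by one position, from the vector itself to find which differences fall below my threshold. This raised some problems of its own: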

cluster.count <- function(x, threshold = 0.5){

  x1 = c(x,10) # append 10 to the end of x so that x1 and x2 have the same length
  x2 = c(0,x) # prepend 0 to x so that x2 (x shifted by one) can be subtracted from x1

  x1 = as.numeric(as.character(unlist(x1))) # flatten x1 into a numeric vector
  x2 = as.numeric(as.character(unlist(x2))) # flatten x2 into a numeric vector

  result1 = (x1-x2) # difference between each value and the one before it
  result1 = as.numeric(result1 < threshold) # 1 where the difference is below the threshold, else 0

  result2 = c(result1[-1], 10) # result1 shifted left by one, padded with 10

  result3 = result2 - result1 # 1 where the flag rises from 0 to 1, -1 where it falls from 1 to 0

  print(result3)
  return(result1)
}
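The shift-and-subtract step can also be written with base R's `diff()`, which returns the differences between consecutive elements without any padding. The following is only a minimal sketch, not the asker's method: `count_gap_clusters` is a hypothetical name, and it assumes that a cluster is a run of times whose consecutive gaps are all at most the threshold, and that an isolated time counts as a cluster of size 1. Neither assumption is stated in the question.

```r
# Hypothetical helper, not from the original post.
# Assumption: a cluster is a run of times in which every consecutive gap
# is <= threshold; an isolated time counts as its own cluster.
count_gap_clusters <- function(x, threshold = 0.5) {
  x <- sort(as.numeric(unlist(x)))  # flatten list input and order the times
  if (length(x) == 0) {
    return(0)
  }
  gaps <- diff(x)  # x[i + 1] - x[i], giving length(x) - 1 values
  # Every gap above the threshold starts a new cluster; add 1 for the first.
  sum(gaps > threshold) + 1
}

count_gap_clusters(c(1.1, 1.2, 1.3, 2, 2.2, 5))  # 3
```

Because the gaps are compared in one vectorized step, no index bookkeeping or padding values are needed.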

For the sample vector c(1.1,1.2,1.3,2,2.2,5), my current function gives the following output: 1 0 -1 1 -1 0 10
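For this sample vector, cluster sizes can be read off by labelling each time with a running cluster id. This is a sketch under stated assumptions rather than a definitive answer: it assumes the times are already sorted, that a new cluster starts wherever the gap to the previous time exceeds the threshold, and that single isolated times are kept as clusters. The variable names are illustrative.

```r
x <- c(1.1, 1.2, 1.3, 2, 2.2, 5)  # sample vector from the question
threshold <- 0.5

# TRUE where the gap to the previous time exceeds the threshold;
# the first time has no predecessor, so it is marked FALSE.
new_cluster <- c(FALSE, diff(x) > threshold)

# A running count of cluster starts gives each time a cluster id.
cluster_id <- cumsum(new_cluster) + 1

# Number of times in each cluster, as a plain integer vector.
cluster_sizes <- as.vector(table(cluster_id))

cluster_id     # 1 1 1 2 2 3
cluster_sizes  # 3 2 1
```

If isolated times should not count as clusters, `cluster_sizes[cluster_sizes > 1]` keeps only the groups with at least two members.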

Is there a simpler way that I'm not seeing?

0 answers:

No answers