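How to list the indices of all consecutive and single values in a row of an R matrix

Asked: 2016-06-12 18:54:09

Tags: r matrix dataframe match which

I have an xlsx file in which some cells contain the value -3. Some of these are isolated cells, and some are runs of consecutive cells that all contain -3. I am trying to write an R script that finds the indices of these cells, so that for an isolated -3 I get a single index and for a run of consecutive -3 values I get the start and end indices.

Below is the matrix from the xlsx file; it has 20 columns and 2 rows: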

3.203   3.204   3.205   -3  3.207   3.207   -3  -3  -3  3.206   3.208   3.207   -3  3.264   3.207   3.208   -3  -3  3.209   -3
3.205   3.205   3.205   3.21    3.208   3.208   3.209   -3  -3  3.209   3.211   3.21    3.211   3.211   3.21    -3  3.213   3.211   3.212   3.212
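
For readers who want to reproduce this without the xlsx file, the two rows above can be typed in directly. This is only a sketch: the names `row1`, `row2` and `missingCells` are illustrative, and `dataMatrix` is reused from the script further down.

```r
# Rebuild the 2 x 20 example matrix by hand (values copied from the rows above).
row1 <- c(3.203, 3.204, 3.205, -3, 3.207, 3.207, -3, -3, -3, 3.206,
          3.208, 3.207, -3, 3.264, 3.207, 3.208, -3, -3, 3.209, -3)
row2 <- c(3.205, 3.205, 3.205, 3.21, 3.208, 3.208, 3.209, -3, -3, 3.209,
          3.211, 3.21, 3.211, 3.211, 3.21, -3, 3.213, 3.211, 3.212, 3.212)
dataMatrix <- rbind(row1, row2)
dimnames(dataMatrix) <- NULL  # drop the "row1"/"row2" labels added by rbind()

# One (row, col) pair per -3 cell; runs are not grouped at this stage.
missingCells <- which(dataMatrix == -3, arr.ind = TRUE)
```

`which(..., arr.ind = TRUE)` gives every matching cell individually, so the grouping into single cells and consecutive runs still has to be done separately.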

I want the result to look like this (I treat -3 as a missing value):

1  missing value at: ( 1 , 4 )
3  missing values starting from: ( 1 , 7 ) to ( 1 , 9 )
1  missing value at: ( 1 , 13 )
2  missing values starting from: ( 1 , 17 ) to ( 1 , 18 )
1  missing value at: ( 1 , 20 )
2  missing values starting from: ( 2 , 8 ) to ( 2 , 9 )
1  missing value at: ( 2 , 16 )

Here is my R script, but it gives the wrong results. I am confused about how to use the indices correctly.

fileData <- read.xlsx(filePath, 1, header = FALSE, sep = ",")
dataMatrix <- data.matrix(fileData)

## Find the number of rows and columns in the matrix
numberOfRows <- nrow(dataMatrix)
numberOfColumns <- ncol(dataMatrix)

## Access each value of the dataMatrix and check whether it is -3
  for (i in 1:numberOfRows)  # for each row
  {
    # Get indexes for -3 value
    missingValueList = which(dataMatrix[i,] == -3); 
    # Find the index after which there is a break (so no consecutive value)
    consecutiveBreaks = which(diff(missingValueList) != 1);
    print(missingValueList)
    print(consecutiveBreaks)

    j=0;

    for(k in 1:length(consecutiveBreaks))
    {
      if(k == 1)
      {
        cat(consecutiveBreaks[k], " missing value at: (",i,",",missingValueList[j+k],")","\n");
      }
      else
      {
        cat("Value of k: ", k, "\n");
        cat(abs(consecutiveBreaks[k]-consecutiveBreaks[k-1]), " missing values starting from: (",i,",",missingValueList[j],")","\n");

      }
      j=j+1;
    }
  }

Could someone help me get the desired result?
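
One way to see where the indexing in the script goes astray is to look at what `which()` and `diff()` return for row 1 of the example. The sketch below hard-codes that row's -3 positions; `runId`, `runs`, `starts`, `ends` and `runLengths` are illustrative names that do not appear in the script.

```r
# Column positions of -3 in row 1 of the example data.
missingValueList <- c(4, 7, 8, 9, 13, 17, 18, 20)

# diff() returns one element fewer than its input, so these are positions
# *within missingValueList* after which a run ends. They are neither run
# lengths nor column numbers, and the final run never appears here.
consecutiveBreaks <- which(diff(missingValueList) != 1)

# Label each position with a run id that increases whenever the gap is not 1.
runId <- cumsum(c(1, diff(missingValueList) != 1))

# Group the column positions by run, then take start, end and length per run.
runs <- split(missingValueList, runId)
starts <- sapply(runs, min)
ends <- sapply(runs, max)
runLengths <- sapply(runs, length)
```

With these values, `consecutiveBreaks` has four elements while the row has five runs, which is one reason a loop over `consecutiveBreaks` alone cannot report every run.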

1 Answer:

Answer 0 (score: 1)

Here you go. I think this should work for your data:

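# Assumption: mdata is the numeric matrix to process (dataMatrix in the question).
val = 1;      # offset from column j to the next cell being checked
counter = 1;  # length of the current run of -3 values
temp = matrix();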

for (i in 1:nrow(mdata))
{
  for (j in 1:ncol(mdata))
  {
  if (mdata[i,j] == -3)
  {

    # Stop at the last column so that mdata[i, j + val] stays in bounds.
    while (j + val <= ncol(mdata))
    {
      if (mdata[i,j + val] == -3)
      {
        counter = counter + 1;
        val = val + 1;
        next;                    
      }
      else
      {
        break;

      }

    }

    if (counter == 1)
    {
      #print(j);
      #print(mdata[i, (j - 1):(j + 1)]);

      temp <- t(as.matrix(mdata[i, (j - 1):(j + 1)]))
      cat("\n This is with counter 1 \n")
      print(temp)
      cat("\n matrix: temp-1", temp[,1],"temp-2", temp[,3],"\n");
      to.avg <- c(temp[,1], temp[,3]);
      avg<-mean(to.avg)
      mdata[i,j] = avg;
    }
    else
    {

      temp <- t(as.matrix(mdata[i,(j - 1):(j + counter)]))
      cat("\n This is with multiple count \n")
      cat(counter,"consecutive values were found, processing accordingly \n")
      print(temp);

      for (k in 0:(counter-1))
      {
        # cat("\n reading temp at the start \n")
        # print(temp)
        cat("\n K is ",(k+1), "and array is",length(temp),"long \n")
        to.avg <- c(temp[,(k+1)], temp[,length(temp)]);
        cat("averaging", temp[,(k+1)],"and", temp[,length(temp)]);
        avg<-mean(to.avg)
        cat("\n average =",avg);
        temp[,(k+2)] = avg;
        # cat("\n reading temp as this \n")
        # print(temp)
        mdata[i,j+k]=avg
      }

    }

  }
  else
  {
    mdata[i,j] = mdata[i,j];
  }

  val = 1;
  counter = 1;

  }

}
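
The loop above overwrites each -3 with an average involving neighbouring values; it does not print the positions. For the listing the question asks for, a shorter route may be base R's `rle()`, which encodes a vector as runs of equal values. The following is a minimal sketch rather than a definitive implementation: `findRuns` is a hypothetical helper, its `marker` argument is an assumption, and the matrix `m` is retyped from the question's data.

```r
# Hypothetical helper: one output row per run of `marker` in each matrix row.
findRuns <- function(m, marker = -3) {
  # Seed with an empty frame so the result is a data frame even with no runs.
  out <- list(data.frame(row = integer(0), start = integer(0),
                         end = integer(0), n = integer(0)))
  for (i in seq_len(nrow(m))) {
    r <- rle(m[i, ] == marker)        # runs of TRUE/FALSE along row i
    ends <- cumsum(r$lengths)         # last column of each run
    starts <- ends - r$lengths + 1L   # first column of each run
    keep <- which(r$values)           # keep only the runs of `marker`
    if (length(keep) > 0) {
      out[[length(out) + 1]] <- data.frame(
        row = i, start = starts[keep], end = ends[keep], n = r$lengths[keep]
      )
    }
  }
  do.call(rbind, out)
}

# Example matrix retyped from the question (2 rows, 20 columns).
m <- rbind(
  c(3.203, 3.204, 3.205, -3, 3.207, 3.207, -3, -3, -3, 3.206,
    3.208, 3.207, -3, 3.264, 3.207, 3.208, -3, -3, 3.209, -3),
  c(3.205, 3.205, 3.205, 3.21, 3.208, 3.208, 3.209, -3, -3, 3.209,
    3.211, 3.21, 3.211, 3.211, 3.21, -3, 3.213, 3.211, 3.212, 3.212)
)
runs <- findRuns(m)

# Print one line per run, with the wording used in the question.
for (k in seq_len(nrow(runs))) {
  x <- runs[k, ]
  if (x$n == 1) {
    cat(x$n, " missing value at: (", x$row, ",", x$start, ")\n")
  } else {
    cat(x$n, " missing values starting from: (", x$row, ",", x$start,
        ") to (", x$row, ",", x$end, ")\n")
  }
}
```

Because the runs come back as a data frame with `row`, `start`, `end` and `n` columns, the same result can be filtered or passed to a later fill-in step without parsing printed text.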