Reading and counting consecutive points

Time: 2018-11-23 16:24:27

Tags: r data.table

I'm having trouble reading coordinates in a 2D space out of a data.table like the one below and deriving several quality measures from them:

library(data.table)

DT <- data.table(
  A = c(rep("aa", 2), rep("bb", 2)),
  B = c(rep("H", 2), rep("Na", 2)),
  Low = c(0, 3, 1, 1),
  High = c(8, 10, 9, 8),
  Time = c("0,1,2,3,4,5,6,7,8,9,10", "0,1,2,3,4,5,6,7,8,9,10", "0,1,2,3,4,5,6,7,8,9,10", "0,1,2,3,4,5,6,7,8,9,10"),
  Intensity = c("0,0,0,0,561464,0,0,0,0,0,0", "0,0,0,6548,5464,5616,0,0,0,68716,0", "5658,12,6548,6541,8,5646854,54565,56465,546,65,0", "0,561464,0,0,0,0,0,0,0,0,0")
)

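The "Time" and "Intensity" columns hold the x and y values of the 2D space. The "Low" and "High" columns are boundaries on the x axis ("Time"). Now I want to check several qualities of the y dimension ("Intensity") strictly within these boundaries (<, >):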

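  1. Maximum number of consecutive points > 0 (row 1: 1, row 2: 2, ...)
  2. Total number of points > 0 (row 1: 1, row 2: 3, ...)
  3. Maximum number of consecutive points > baseline, where the baseline is the intensity value at the lower or the upper boundary, whichever is smaller (so for row 3 it is 12, for all others 0) (row 3: 4, all other rows the same as in 1)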
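
To make the three definitions concrete, here is a small worked sketch for row 2 only. It assumes the boundaries are exclusive (Low < Time < High), which is what the expected values in the question imply, and that both boundary values occur in "Time". The helper name `longest_run` is hypothetical and not part of the original post.

```r
# Row 2 of the example data: Low = 3, High = 10
time      <- as.numeric(strsplit("0,1,2,3,4,5,6,7,8,9,10", ",", fixed = TRUE)[[1]])
intensity <- as.numeric(strsplit("0,0,0,6548,5464,5616,0,0,0,68716,0", ",", fixed = TRUE)[[1]])
low  <- 3
high <- 10

# Hypothetical helper: length of the longest run of TRUE in a logical vector (0 if none)
longest_run <- function(flag) {
  runs <- rle(flag)
  max(c(0L, runs$lengths[runs$values]))
}

# Points strictly between the boundaries (assumption: exclusive on both sides)
inside <- intensity[time > low & time < high]

# Baseline: the smaller of the two intensities at the boundaries
baseline <- min(intensity[time == low], intensity[time == high])

first  <- longest_run(inside > 0)         # longest run of points > 0
second <- sum(inside > 0)                 # total number of points > 0
third  <- longest_run(inside > baseline)  # longest run of points > baseline
```

For this row the sketch yields first = 2, second = 3 and third = 2, which agrees with the values given in the list above.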

So the output should be a table like this:

DT <- data.table(
  A = c(rep("aa", 2), rep("bb", 2)),
  B = c(rep("H", 2), rep("Na", 2)),
  Low = c(0, 3, 1, 1),
  High = c(8, 10, 9, 8),
  Time = c("0,1,2,3,4,5,6,7,8,9,10", "0,1,2,3,4,5,6,7,8,9,10", "0,1,2,3,4,5,6,7,8,9,10", "0,1,2,3,4,5,6,7,8,9,10"),
  Intensity = c("0,0,0,0,561464,0,0,0,0,0,0", "0,0,0,6548,5464,5616,0,0,0,68716,0", "5658,12,6548,6541,8,5646854,54565,56465,546,65,0", "0,561464,0,0,0,0,0,0,0,0,0"),
  First = c(1, 2, 7, 0),
  Second = c(1, 3, 7, 0),
  Third = c(1, 2, 4, 0)
)

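Does anyone know how to approach this task? So far I have been trying with data.table, but I would also be happy if someone knows a better package for this kind of task.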

Thank you very much!

Yasel

1 answer:

Answer 0 (score: 2)

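Here is one approach with base R. We split the "Intensity" and "Time" columns into a list, then loop over the corresponding elements of the list together with the "High" and "Low" columns, extract the values of "Intensity" based on the index from "Low" to "High", and check whether they are greater than 0 (with an additional condition check based on the value at "Low"). rle is then used to find the length of the consecutive elements that are greater than 0 (or greater than the value at the "Low" index). Finally, rbind the list contents into a data.frame and cbind it with the original dataset.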
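
The answer's original code is not preserved on this page. The steps it describes can be sketched roughly as follows. This is a reconstruction under assumptions, not the answerer's code: the boundaries are treated as exclusive and the baseline is the smaller of the two boundary intensities, as the question's expected output implies. The helpers `longest_run` and `count_points` are hypothetical names, and a plain data.frame is used so the sketch runs without extra packages (the same calls should work on a data.table).

```r
# Example data from the question, as a base R data.frame
DT <- data.frame(
  A = rep(c("aa", "bb"), each = 2),
  B = rep(c("H", "Na"), each = 2),
  Low = c(0, 3, 1, 1),
  High = c(8, 10, 9, 8),
  Time = rep("0,1,2,3,4,5,6,7,8,9,10", 4),
  Intensity = c("0,0,0,0,561464,0,0,0,0,0,0",
                "0,0,0,6548,5464,5616,0,0,0,68716,0",
                "5658,12,6548,6541,8,5646854,54565,56465,546,65,0",
                "0,561464,0,0,0,0,0,0,0,0,0"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: longest run of TRUE in a logical vector, 0 if there is none
longest_run <- function(flag) {
  runs <- rle(flag)
  max(c(0L, runs$lengths[runs$values]))
}

# Hypothetical helper: the three counts for a single row
count_points <- function(time, intensity, low, high) {
  inside   <- intensity[time > low & time < high]  # assumed exclusive boundaries
  baseline <- min(intensity[time == low], intensity[time == high])
  c(First  = longest_run(inside > 0),
    Second = sum(inside > 0),
    Third  = longest_run(inside > baseline))
}

# Split the comma-separated strings into numeric vectors, one list element per row
times       <- lapply(strsplit(DT$Time, ",", fixed = TRUE), as.numeric)
intensities <- lapply(strsplit(DT$Intensity, ",", fixed = TRUE), as.numeric)

# Loop over the rows in parallel, rbind the per-row results, cbind to the original data
counts <- Map(count_points, times, intensities, DT$Low, DT$High)
result <- cbind(DT, do.call(rbind, counts))
```

On the example data, `result` gains the columns First, Second and Third with the values the question asks for. If a boundary value does not occur in "Time", `min()` has nothing to compare and the baseline would need separate handling.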