Question

Suppose I have a large data.table like this:

Sequence  A1  B1  A2  B2
s1         0   2   9  11
s2         1   3   3   2
s3         2   2   4   1
s4         3   5   4  14
s5         3   7   2   0
s6         0   2   8   5
.          .   .   .   .
.          .   .   .   .
.          .   .   .   .

I want to compute operations such as log2(A2/A1) and log2(B2/B1) and return a data.table with the columns named "A2/A1" and "B2/B1" that looks like this:

Sequence  A2/A1      B2/B1
s1        log2(9/0)  log2(11/2)
s2        log2(3/1)  log2(2/3)
s3        log2(4/2)  log2(1/2)
s4        log2(4/3)  log2(14/5)
s5        log2(2/3)  log2(0/7)
s6        log2(8/0)  log2(5/2)

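I have already found a workaround, but it is not satisfactory. Because the columns are selected dynamically (in the UI), I cannot really use it, and I still get all the columns back (A1, B1, A2, B2, plus A2/A1 and B2/B1).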

selectInput("firstSelection", "Select First Factor", choices = "", multiple = TRUE,
            helpText("First parameter for the calculation of Regulation-Factor")),
selectInput("secondSelection", "Select Second Factor", choices = "", multiple = TRUE,
            helpText("Second parameter for the calculation of Regulation-Factor"))

Here is my workaround:

input_table <<- getData()[, paste(input$secondSelection, input$firstSelection, sep = "/") :=
                            list(get(input$secondSelection[1]) / get(input$firstSelection[1]),
                                 get(input$secondSelection[2]) / get(input$firstSelection[2]))]

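I assume there must be a better way, perhaps using functions such as apply, or special symbols and arguments such as .I, .SD and .SDcols. I have read about them, but I still do not really understand how and when to use them.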

Answer 1

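We can do this with the set function. Create the result dataset ('res') from the first column of the original dataset ('Sequence') plus two columns filled with NA. Then loop over the indices specified in 'j1' and set the values in those columns: subset the matching columns of 'dt1', divide them, and take the log2.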

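# Result table: the Sequence column plus two NA placeholders for the ratios
res <- data.table(Sequence = dt1$Sequence, A2A1 = NA_real_, B2B1 = NA_real_)
# Column positions to fill in 'res' (2 and 3): one per distinct prefix (A, B)
j1 <- as.integer(seq_len(uniqueN(sub("\\d+", "", names(dt1)[-1]))) + 1)

# In 'dt1', column j holds A1/B1 and column j + 2 holds the matching A2/B2
for (j in j1) {
  set(res, i = NULL, j = j, value = log2(dt1[[j + 2]] / dt1[[j]]))
}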
res
#    Sequence       A2A1       B2B1
#1:       s1        Inf  2.4594316
#2:       s2  1.5849625 -0.5849625
#3:       s3  1.0000000 -1.0000000
#4:       s4  0.4150375  1.4854268
#5:       s5 -0.5849625       -Inf
#6:       s6        Inf  1.3219281

log2(9/0)
#[1] Inf
log2(11/2)
#[1] 2.459432

Data

dt1 <- structure(list(Sequence = c("s1", "s2", "s3", "s4", "s5", "s6"
 ), A1 = c(0L, 1L, 2L, 3L, 3L, 0L), B1 = c(2L, 3L, 2L, 5L, 7L, 
 2L), A2 = c(9L, 3L, 4L, 4L, 2L, 8L), B2 = c(11L, 2L, 1L, 14L, 
 0L, 5L)), .Names = c("Sequence", "A1", "B1", "A2", "B2"), 
 class = "data.frame", row.names = c(NA, -6L))
setDT(dt1)
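
Since the question selects its columns dynamically, the same idea can also be driven by two character vectors of column names instead of fixed positions. The following is only a minimal sketch, not part of the original answer: `first` and `second` are hypothetical stand-ins for `input$firstSelection` and `input$secondSelection`, `res2` is an illustrative name, and the two selections are assumed to have equal length and to be paired by position.

```r
library(data.table)

# Same sample data as above, rebuilt so the sketch is self-contained
dt1 <- data.table(
  Sequence = c("s1", "s2", "s3", "s4", "s5", "s6"),
  A1 = c(0L, 1L, 2L, 3L, 3L, 0L),
  B1 = c(2L, 3L, 2L, 5L, 7L, 2L),
  A2 = c(9L, 3L, 4L, 4L, 2L, 8L),
  B2 = c(11L, 2L, 1L, 14L, 0L, 5L)
)

# Hypothetical stand-ins for input$firstSelection / input$secondSelection
first  <- c("A1", "B1")
second <- c("A2", "B2")

# Pair the columns by position and compute log2(second / first) per pair
ratios <- Map(function(num, den) log2(dt1[[num]] / dt1[[den]]), second, first)
names(ratios) <- paste(second, first, sep = "/")

# Start from Sequence only, so the source columns are not carried along
res2 <- data.table(Sequence = dt1$Sequence)
cols <- names(ratios)
res2[, (cols) := ratios]
res2
```

Because `res2` starts from the `Sequence` column alone, it contains only the ratio columns and none of A1, B1, A2, B2. As in the result above, a zero in the denominator gives `Inf` and a zero in the numerator gives `-Inf`.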
