Suppose I have a large data.table like this:
Sequence A1 B1 A2 B2
s1 0 2 9 11
s2 1 3 3 2
s3 2 2 4 1
s4 3 5 4 14
s5 3 7 2 0
s6 0 2 8 5
. . . . .
. . . . .
. . . . .
I want to compute operations such as log2(A2/A1) and log2(B2/B1) and return a data.table with columns named "A2/A1" and "B2/B1" that looks like this:
Sequence A2/A1 B2/B1
s1 log2(9/0) log2(11/2)
s2 log2(3/1) log2(2/3)
s3 log2(4/2) log2(1/2)
s4 log2(4/3) log2(14/5)
s5 log2(2/3) log2(0/7)
s6 log2(8/0) log2(5/2)
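I have already found a workaround, but it does not work well. Because the columns are selected dynamically (in the UI), I cannot really use it, and I still get all the columns (A1, B1, A2, B2, plus A2/A1 and B2/B1).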
selectInput("firstSelection", "Select First Factor", choices = "", multiple = TRUE),
helpText("First parameter for the calculation of Regulation-Factor"),
selectInput("secondSelection", "Select Second Factor", choices = "", multiple = TRUE),
helpText("Second parameter for the calculation of Regulation-Factor")
Here is my workaround:
input_table <<- getData()[, paste(input$secondSelection, input$firstSelection,sep= "/"):=
list(get(input$secondSelection[1])/get(input$firstSelection[1]),
get(input$secondSelection[2])/get(input$firstSelection[2]))]
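I guess there must be a better way, perhaps using functions like apply or special symbols and arguments such as .I, .SD and .SDcols. I have read about them but still do not really understand how and when to use them.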
Answer 0: (score: 1)
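We can do this with the set function. Create the result dataset ('res') from the first column 'Sequence' of the original dataset plus two columns filled with NA. Then loop over the indices in 'j1' and set the values of those columns: take the matching columns of 'dt1', divide the second by the first, and apply log2.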
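library(data.table)
# Result table: Sequence plus one NA placeholder column per ratio
res <- data.table(Sequence = dt1$Sequence, A2A1 = NA_real_, B2B1 = NA_real_)
# Column positions of the first group (A1, B1) in dt1, i.e. 2 and 3
j1 <- as.integer(seq_len(uniqueN(sub("\\d+", "", names(dt1)[-1]))) + 1)
for (j in j1) {
  # The second group (A2, B2) sits two columns to the right of the first
  set(res, i = NULL, j = j, value = log2(dt1[[j + 2]] / dt1[[j]]))
}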
res
# Sequence A2A1 B2B1
#1: s1 Inf 2.4594316
#2: s2 1.5849625 -0.5849625
#3: s3 1.0000000 -1.0000000
#4: s4 0.4150375 1.4854268
#5: s5 -0.5849625 -Inf
#6: s6 Inf 1.3219281
log2(9/0)
#[1] Inf
log2(11/2)
#[1] 2.459432
dt1 <- structure(list(Sequence = c("s1", "s2", "s3", "s4", "s5", "s6"
), A1 = c(0L, 1L, 2L, 3L, 3L, 0L), B1 = c(2L, 3L, 2L, 5L, 7L,
2L), A2 = c(9L, 3L, 4L, 4L, 2L, 8L), B2 = c(11L, 2L, 1L, 14L,
0L, 5L)), .Names = c("Sequence", "A1", "B1", "A2", "B2"),
class = "data.frame", row.names = c(NA, -6L))
setDT(dt1)
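Since the question mentions that the columns are chosen dynamically in the UI, here is a minimal sketch of how the same calculation could be driven by two character vectors instead of fixed column positions. The names first, second, new_names, ratios and res2 are made up for illustration; first and second stand in for input$firstSelection and input$secondSelection, and the sketch assumes both selections have the same length and are in matching order. It is one possible approach, not the only one.

```r
library(data.table)

# Same sample data as above, rebuilt here so the sketch is self-contained.
dt1 <- data.table(
  Sequence = c("s1", "s2", "s3", "s4", "s5", "s6"),
  A1 = c(0L, 1L, 2L, 3L, 3L, 0L),
  B1 = c(2L, 3L, 2L, 5L, 7L, 2L),
  A2 = c(9L, 3L, 4L, 4L, 2L, 8L),
  B2 = c(11L, 2L, 1L, 14L, 0L, 5L)
)

# Hypothetical stand-ins for input$firstSelection and input$secondSelection.
first  <- c("A1", "B1")
second <- c("A2", "B2")

# Column names of the form "A2/A1", "B2/B1".
new_names <- paste(second, first, sep = "/")

# Pair each numerator column with its denominator and take log2 of the ratio.
ratios <- Map(function(num, den) log2(dt1[[num]] / dt1[[den]]), second, first)

# Keep only Sequence plus the new columns; dt1 itself is not modified.
res2 <- data.table(Sequence = dt1$Sequence)
res2[, (new_names) := unname(ratios)]
res2
```

Because the result is built in a fresh table, the original A1, B1, A2 and B2 columns do not show up in the output, which was the other complaint in the question. Rows with a zero count still give Inf or -Inf, exactly as in the set version.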