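I am new to data.table and am trying to learn it while moving from data.frame to data.table.
Right now I am trying to split text into new columns, following the discussion here.
This is what I am trying to do.
Here is some sample data: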
# sample data frame
test <- data.table(POS = c(254, 280, 303, 22, 105, 173, 230, 235, 257, 258),
value = c("0/1:15:3:123:12:478:-38.8484,0,-6.94934",
"0/0:15:15:577:0:0:0,-4.51545,-52.25",
"0/0:13:13:276:0:0:0,-3.91339,-25.0455",
"0/0:367:347:13643:0:0:0,-104.457,-1226.73",
"0/0:367:344:13145:5,0,1,0:168,0,41,0:0,-89.9158,-1166.99,-103.554,-1168.49,-1182.1,-100.161,-1165.11,-1178.71,-1178.41,-103.554,-1168.49,-1182.1,-1178.71,-1182.1",
"0/1:344:180:5411:156:4394:-294.227,0,-385.695",
"0/0:352:349:12289:1:12:0,-104.28,-1104.15",
"0/0:352:345:10691:1:12:0,-103.081,-960.583",
"0/0:352:351:13162:1:41:0,-101.868,-1179.6",
"0/0:352:349:12593:0:0:0,-105.059,-1132.45"))
I want to split value on ":" into separate columns with specific column names. The code below (which I learned from the link above) does exactly that.
test[, c("GT", "DP", "RO", "QR", "AO", "QA", "GL") := tstrsplit(value, ":",
fixed=TRUE)]
However, is it possible to use an R object in place of the explicit c(names) above? Something like this:
# new column names
namesForm <- c("GT", "DP", "RO", "QR", "AO", "QA", "GL")
Then use namesForm as below:
# use the namesForm as column names
test[, namesForm := tstrsplit(value, ":", fixed=TRUE)]
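This gives me a warning and a different result: a data.table with 3 variables, where the last one, namesForm, is a list column of 10 items recycled from the 7 list elements returned by tstrsplit.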
Warning message:
In `[.data.table`(test, , `:=`(namesForm, tstrsplit(value, ":", :
Supplied 7 items to be assigned to 10 items of column 'namesForm' (recycled leaving remainder of 3 items).
So my question is: is it possible to use an R object/variable in place of the explicit c()?
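Answer 0 (score: 2)
You can use (namesForm) := instead of namesForm := . Without the parentheses, data.table treats namesForm as the literal name of a single new column; wrapping it in parentheses makes data.table evaluate the variable and use its contents as the column names.
Example: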
test2 <- copy(test)
namesForm <- c("GT", "DP", "RO", "QR", "AO", "QA", "GL")
str(test[, c("GT", "DP", "RO", "QR", "AO", "QA", "GL") := tstrsplit(value, ":", fixed=TRUE)])
# Classes ‘data.table’ and 'data.frame': 10 obs. of 9 variables:
# $ POS : num 254 280 303 22 105 173 230 235 257 258
# $ value: chr "0/1:15:3:123:12:478:-38.8484,0,-6.94934" "0/0:15:15:577:0:0:0,-4.51545,-52.25" "0/0:13:13:276:0:0:0,-3.91339,-25.0455" "0/0:367:347:13643:0:0:0,-104.457,-1226.73" ...
# $ GT : chr "0/1" "0/0" "0/0" "0/0" ...
# $ DP : chr "15" "15" "13" "367" ...
# $ RO : chr "3" "15" "13" "347" ...
# $ QR : chr "123" "577" "276" "13643" ...
# $ AO : chr "12" "0" "0" "0" ...
# $ QA : chr "478" "0" "0" "0" ...
# $ GL : chr "-38.8484,0,-6.94934" "0,-4.51545,-52.25" "0,-3.91339,-25.0455" "0,-104.457,-1226.73" ...
# - attr(*, ".internal.selfref")=<externalptr>
str(test2[, (namesForm) := tstrsplit(value, ":", fixed=TRUE)])
# Classes ‘data.table’ and 'data.frame': 10 obs. of 9 variables:
# $ POS : num 254 280 303 22 105 173 230 235 257 258
# $ value: chr "0/1:15:3:123:12:478:-38.8484,0,-6.94934" "0/0:15:15:577:0:0:0,-4.51545,-52.25" "0/0:13:13:276:0:0:0,-3.91339,-25.0455" "0/0:367:347:13643:0:0:0,-104.457,-1226.73" ...
# $ GT : chr "0/1" "0/0" "0/0" "0/0" ...
# $ DP : chr "15" "15" "13" "367" ...
# $ RO : chr "3" "15" "13" "347" ...
# $ QR : chr "123" "577" "276" "13643" ...
# $ AO : chr "12" "0" "0" "0" ...
# $ QA : chr "478" "0" "0" "0" ...
# $ GL : chr "-38.8484,0,-6.94934" "0,-4.51545,-52.25" "0,-3.91339,-25.0455" "0,-104.457,-1226.73" ...
# - attr(*, ".internal.selfref")=<externalptr>
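The two str() outputs above are identical, which can also be checked programmatically. The following is a minimal, self-contained sketch of the same idea; the table name dt and the choice of only two rows are assumptions made for illustration, not part of the original question.

```r
library(data.table)

# Illustrative table: the first two rows of the question's sample data
dt <- data.table(
  POS   = c(254, 280),
  value = c("0/1:15:3:123:12:478:-38.8484,0,-6.94934",
            "0/0:15:15:577:0:0:0,-4.51545,-52.25")
)

# Column names stored in a character vector
namesForm <- c("GT", "DP", "RO", "QR", "AO", "QA", "GL")

# The parentheses make data.table evaluate namesForm, so its seven
# elements become the names of the seven new columns
dt[, (namesForm) := tstrsplit(value, ":", fixed = TRUE)]

# Show the resulting column names: the two original ones plus namesForm
print(names(dt))
```

Since := modifies the table by reference, dt itself gains the new columns; use copy() first, as the answer does with test2, if the original table must stay untouched.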