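I have an R script containing a function that I received in an answer to this question: R: For loop nested in for loop.
The script works fine on the first part of my dataset. I am now trying to use it on another part which, as far as I can tell, has exactly the same format as the first, but for some reason I get an error when I run the script on it. I can't figure out what is causing the error.
This is the script I am using: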
require(data.table)
MappingTable_Calibrated = read.csv2(file.choose(), header=TRUE)
head(MappingTable_Calibrated)
# The data is sorted primarily by Scaffold number in ascending order, and secondarily by Cal_Startgen in ascending order.
MappingTable_Calibratedord = MappingTable_Calibrated[order(MappingTable_Calibrated$Scaffold, MappingTable_Calibrated$Cal_Startgen),]
head(MappingTable_Calibratedord)
dt <- data.table(MappingTable_Calibratedord, key = "Scaffold,Cal_Startgen")
head(dt)
# The following function creates pairs of loci for each scaffold.
# The function is a modified version of a function retrieved from http://www.stackoverflow.com
fn = function(dtIn,id){
    # Creates the object dtHead containing all lines of dtIn except the last one.
    dtHead = head(dtIn, n = nrow(dtIn) - 1)
    # The names of dtHead are appended with _a. paste0 is short for: paste(x, sep="")
    setnames(dtHead, paste0(colnames(dtHead), "_a"))
    # Creates the object dtTail containing all lines of dtIn except the first one.
    dtTail = tail(dtIn, n = nrow(dtIn) - 1)
    # The names of dtTail are appended with _b.
    setnames(dtTail, paste0(colnames(dtTail), "_b"))
    # dtHead and dtTail are combined. Scaffold is defined as id. The blank column "Pairwise_Distance" is added to the table.
    cbind(dtHead, dtTail, Scaffold = id, Pairwise_Distance = 0)
}
# The function is run on the data. .SDcols defines the columns to be included in the output.
output = dt[, fn(.SD, Scaffold), by = Scaffold, .SDcols = c("Name", "Startpos", "Endpos", "Rev", "Startgen", "Endgen", "Cal_Startgen", "Cal_Endgen", "Length")]
output = as.data.frame(output[, with = FALSE])
But when trying to create "output" I get the following error:
Error in data.table(..., key = key(..1)) : Item 1 has no length. Provide at least one item (such as NA, NA_integer_ etc) to be repeated to match the 2 rows in the longest column. Or, all columns can be 0 length, for insert()ing rows into.
dt looks like this:
Name Length Startpos Endpos Scaffold Startgen Endgen Rev Match Cal_Startgen Cal_Endgen
1: Locus_7173 144 0 144 34 101196 101340 1 1 101196 101340
2: Locus_133 110 0 110 34 223659 223776 1 1 223659 223776
3: Locus_2746 161 0 89 65 101415 101504 1 1 101415 101576
The full input "dt" can be found here: https://www.dropbox.com/sh/3j4i04s2rg6b63h/AADkWG3OcsutTiSsyTl8L2Vda?dl=0
Answer 0 (score: 5)
First, track down the data that causes the error:
fn = function(dtIn, id){
    dtHead = head(dtIn, n = nrow(dtIn) - 1)
    setnames(dtHead, paste0(colnames(dtHead), "_a"))
    dtTail = tail(dtIn, n = nrow(dtIn) - 1)
    setnames(dtTail, paste0(colnames(dtTail), "_b"))
    # If cbind fails, r is NULL and we drop into the debugger for that group.
    r <- tryCatch(cbind(dtHead, dtTail, Scaffold = id, Pairwise_Distance = 0), error = function(e) NULL)
    if(is.null(r)) browser()
    r
}
Then you can see that you are trying to cbind elements of different nrow / length:
Browse[1]> dtHead
Empty data.table (0 rows) of 9 cols: Name_a,Startpos_a,Endpos_a,Rev_a,Startgen_a,Endgen_a...
Browse[1]> dtTail
Empty data.table (0 rows) of 9 cols: Name_b,Startpos_b,Endpos_b,Rev_b,Startgen_b,Endgen_b...
Browse[1]> id
[1] 76
Browse[1]> 0
[1] 0
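This is not allowed: dtHead and dtTail have zero rows, while id and 0 each have length 1.
I suggest adding an if(nrow( check or similar, and then, for the nrow = 0 case, adding the columns id = integer(), Pairwise_Distance = numeric().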
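A minimal sketch of such a guard, assuming a scaffold with a single locus should simply contribute no pairs. The name fn_guarded and the toy table are hypothetical and only there for illustration. The sketch returns NULL for those groups instead of building zero-length columns; data.table treats a NULL result in j as "no rows for this group".

```r
library(data.table)

# Hypothetical guarded variant of fn (fn_guarded is not a name from the question).
fn_guarded <- function(dtIn, id) {
  # A scaffold with fewer than two loci cannot form a pair, so return NULL
  # and let data.table emit no rows for this group.
  if (nrow(dtIn) < 2L) return(NULL)
  dtHead <- head(dtIn, n = nrow(dtIn) - 1L)
  setnames(dtHead, paste0(colnames(dtHead), "_a"))
  dtTail <- tail(dtIn, n = nrow(dtIn) - 1L)
  setnames(dtTail, paste0(colnames(dtTail), "_b"))
  cbind(dtHead, dtTail, Scaffold = id, Pairwise_Distance = 0)
}

# Toy data modelled on the three rows shown in the question:
# scaffold 34 has two loci, scaffold 65 has only one.
toy <- data.table(
  Name         = c("Locus_7173", "Locus_133", "Locus_2746"),
  Scaffold     = c(34L, 34L, 65L),
  Cal_Startgen = c(101196L, 223659L, 101415L),
  key          = c("Scaffold", "Cal_Startgen")
)

output <- toy[, fn_guarded(.SD, Scaffold), by = Scaffold,
              .SDcols = c("Name", "Cal_Startgen")]
print(output)
```

With this toy input only scaffold 34 yields a pair; scaffold 65 is skipped rather than raising an error. If you need a placeholder row for single-locus scaffolds instead, the guard is the place to build it.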