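I am trying to combine columns from multiple files, but I get warning messages that seem to affect the merge for some of my files. I am not sure where the problem occurs. Any ideas?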
library(data.table)
# pattern is a regular expression, not a shell glob
file_list <- list.files(pattern = "\\.mirna$")
# Read only the needed columns and build one ID key per row
lst <- lapply(file_list, function(x)
  fread(x, select = c("mir", "seq", "freq", "mism", "add", "t5", "t3"))[,
    list(ID = paste(mir, seq, mism, add, t5, t3), freq = freq)])
# Join all tables on ID
miraligner <- as.data.frame(Reduce(function(x, y) x[y, on = "ID"], lst))
head(miraligner)
Warning messages:
1: In fread(x, select = c("mir", "seq", "freq", "mism", "add", "t5", : Bumped column 9 to type character on data row 6, field contains 'g'. Coercing previously read values in this column from logical, integer or numeric back to character which may not be lossless; e.g., if '00' and '000' occurred before they will now be just '0', and there may be inconsistencies with treatment of ',,' and ',NA,' too (if they occurred in this column before the bump). If this matters please rerun and set 'colClasses' to 'character' for this column. Please note that column type detection uses the first 5 rows, the middle 5 rows and the last 5 rows, so hopefully this message should be very rare. If reporting to datatable-help, please rerun and include the output from verbose=TRUE.
2: In fread(x, select = c("mir", "seq", "freq", "mism", "add", "t5", : Bumped column 9 to type character on data row 16, field contains 't'. Coercing previously read values in this column from logical, integer or numeric back to character which may not be lossless; e.g., if '00' and '000' occurred before they will now be just '0', and there may be inconsistencies with treatment of ',,' and ',NA,' too (if they occurred in this column before the bump). If this matters please rerun and set 'colClasses' to 'character' for this column. Please note that column type detection uses the first 5 rows, the middle 5 rows and the last 5 rows, so hopefully this message should be very rare. If reporting to datatable-help, please rerun and include the output from verbose=TRUE.
My files look like this:
> head(Xfile)
seq name freq mir start end mism add t5 t3 s5 s3 DB precursor ambiguity
1 AACTGGTTGAACAACTGAACC seq_100018_x3 3 hsa-miR-582-3p 54 74 0 0 t 0 ATTGTAAC AACCCAAA miRNA hsa-mir-582 1
2 TAGCACCATTTGAAATCAGTGT seq_10002_x43 43 hsa-miR-29b-3p 52 73 0 0 0 t TATCTAGC TGTTTTAG miRNA hsa-mir-29b-2 1
3 TGAGTGTGTGTGTGTGAGTGTGTGTTTT seq_100046_x3 3 hsa-miR-574-5p 25 49 0 I-TTT 0 GT CGTGTGAG GTGTGTCG miRNA hsa-mir-574 1
4 GTCATACACGGCTCTCCTCTC seq_100072_x3 3 hsa-miR-485-3p 46 66 0 0 0 t GCGAGTCA CTCTTTTA miRNA hsa-mir-485 1
5 CTGGACTTGGAGTCAGAAGGCAC seq_100077_x3 3 hsa-miR-378a-3p 44 64 0 I-AC a 0 TAGCACTG AGGCCT miRNA hsa-mir-378a 1
6 TAACACTGTCTGGTAACGATGGT seq_100080_x3 3 hsa-miR-200a-3p 54 74 0 I-GT 0 t ACTCTAAC ATGTTCAA miRNA hsa-mir-200a 1
Answer 0 (score: 1)
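You don't need to worry about this.
Your 9th column (t5) contains either 0 or a letter.
fread tries to guess each column's type automatically from a small sample of rows (per the warning text: the first 5, the middle 5 and the last 5).
For files where the sampled rows contain only 0, it guesses a numeric type. When it later encounters a value such as "t" or "a", it switches the column to character and is kind enough to tell you.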
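If you would rather not see the warning at all, the message itself suggests setting colClasses to character for the affected column. The following is only a minimal sketch of that idea, not a confirmed fix for your files: the temporary file and its three rows are made up for illustration, and key_cols is a hypothetical helper name. It assumes the files are tab-separated and that every column pasted into the ID can safely be read as text.

```r
library(data.table)

# Hypothetical stand-in for one *.mirna file (made-up rows, tab-separated).
tmp <- tempfile(fileext = ".mirna")
writeLines(c(
  "seq\tfreq\tmir\tmism\tadd\tt5\tt3",
  "AACTGG\t3\thsa-miR-582-3p\t0\t0\t0\t0",
  "TAGCAC\t43\thsa-miR-29b-3p\t0\t0\t0\tt",
  "CTGGAC\t3\thsa-miR-378a-3p\t0\tI-AC\ta\t0"
), tmp)

# Columns that end up in the pasted ID; reading them as character
# means fread never has to switch a column's type part-way through.
key_cols <- c("mir", "seq", "mism", "add", "t5", "t3")

dt <- fread(tmp,
            select = c(key_cols, "freq"),
            colClasses = list(character = key_cols))

str(dt)
```

In the original lapply call, the same colClasses argument could be passed to fread alongside select. Because the ID is built with paste anyway, reading these columns as character should not change the resulting keys as long as the values are written the same way in every file.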