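I am trying to remove a name from the colnames produced by fread(). The first column name only serves as a header for the row names. Later in the workflow this "header" really messes up my data, because it is treated as one of the rows, so I somehow need it to be ignored or absent.
A subset of my DGE_file looks like this: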
GENE ATGGCGAACCTACATCCC ATGGCGAGGACTCAAAGT
1: 0610009B22Rik 1 0
2: 0610009E02Rik 0 0
I tried to remove the first column name like this:
library(Matrix)
library("data.table")
# Read in the dge file
DGE_file<- fread(file="DGE.txt", stringsAsFactors = TRUE)
colnames(DGE_file)<-colnames(DGE_file)[-1]
DGE_file<- as.matrix(DGE_file)
Understandably, this produces an error:
> colnames(DGE_file)<-colnames(DGE_file)[-1]
Error in setnames(x, value) :
Can't assign 10000 names to a 10001 column data.table
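I have also tried replacing it with NA, but that produces an error in downstream processing that I could not resolve.
How can I remove the header "GENE", or make it invisible to the downstream processing?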
Answer 0 (score: 1)
The following should work:
library(Matrix)
library("data.table")
# Read in the dge file
DGE_file<- fread(file="DGE.txt", stringsAsFactors = TRUE)
# Set the first column name to the empty string.
names(DGE_file)[1] <- ""
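Answer 1 (score: 0)
You can read the file without its header by skipping the first line, and then set the column names yourself. In my personal opinion, though, a column with an empty name or with NA as its name may cause problems.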
require(magrittr) # for piping
require(data.table) #For reading with fread
# Read in the dge file
# Without header and skipping the first line
DGE_file <- fread(file = "DGE.txt",
                  skip = 1,
                  header = FALSE,
                  stringsAsFactors = TRUE)
# Set the column names (for an "invisible" first name)
DGE_file <- DGE_file %>%
  purrr::set_names(c("", "ATGGCGAACCTACATCCC",
                     "ATGGCGAGGACTCAAAGT"))
OR
# Set the column names (for NA as the first name)
DGE_file <- DGE_file %>%
  purrr::set_names(c(NA, "ATGGCGAACCTACATCCC",
                     "ATGGCGAGGACTCAAAGT"))
A base R solution for setting the names looks like this:
# Read the file with its header
DGE_file <- fread(file = "DGE.txt",
                  header = TRUE,
                  stringsAsFactors = TRUE)
# Set an "invisible" (empty) string as the first name
names(DGE_file)[1] <- ""
# Or set NA as the first name
names(DGE_file)[1] <- NA
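Since the code in the question ends with as.matrix(), one way to avoid a nameless column altogether might be to turn the first column into row names during that conversion. The sketch below is not taken from either answer. It assumes a tab-separated file and a data.table version whose as.matrix() method accepts the rownames argument. The names dge_lines and DGE_matrix are purely illustrative, and the inline lines stand in for DGE.txt:

```r
library(data.table)

# Inline stand-in for DGE.txt, built from the subset shown in the question.
# Assumption: the real file is tab-separated.
dge_lines <- c(
  "GENE\tATGGCGAACCTACATCCC\tATGGCGAGGACTCAAAGT",
  "0610009B22Rik\t1\t0",
  "0610009E02Rik\t0\t0"
)
DGE_file <- fread(text = dge_lines, header = TRUE)

# Use the first column (GENE) as row names instead of keeping it as data.
DGE_matrix <- as.matrix(DGE_file, rownames = 1)

dim(DGE_matrix)       # genes x barcodes
rownames(DGE_matrix)  # gene identifiers
colnames(DGE_matrix)  # barcodes only; "GENE" no longer appears
```

With a real file you would replace the fread(text = ...) call with fread(file = "DGE.txt"). Whether this fits depends on what the downstream step expects, so treat it as an option to try rather than a guaranteed fix.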