Column names that cannot be changed

Time: 2015-09-12 12:07:15

Tags: r

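I have a data frame whose column names happen to be consecutive numbers, and they were not converted to X1, X2, X3, ... when the file was read into the workspace. This leads to an unwanted ordering of the columns in ggplot2 (1, 10, 11, ..., 2, 21, 22).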

I tried to change the colnames, but whatever I do is ignored:

data <- read.table(file = "tabbed_text.txt", sep="\t", header=T, row.names=1)
str(data[,1:10])
'data.frame':   1208 obs. of  10 variables:
 $ 1 : int  1147 748 1147 944 841 938 513 645 577 309 ...
 $ 2 : int  2298 1017 1741 1380 1230 1460 696 1050 1006 442 ...
...
colnames(data[,1:10])
 [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10"
paste0("V", c(1:10))
 [1] "V1"  "V2"  "V3"  "V4"  "V5"  "V6"  "V7"  "V8"  "V9"  "V10"
colnames(data[,1:10]) <- paste0("V", c(1:10))
colnames(data[,1:10])
 [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10"
new.names <- c("I","do","not","understand","why", "this", "is", "happening", "to", "me")
colnames(data[,1:10]) <- new.names
colnames(data[,1:10])
 [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10"
str(names(data[,1:10]))
chr [1:10] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10"

Where am I going wrong?

Here is the dput output of the first 10x10 cells:

> dput(data[1:10,1:10])
structure(list(X1 = c(1147L, 748L, 1147L, 944L, 841L, 938L, 513L, 
645L, 577L, 309L), X2 = c(2298L, 1017L, 1741L, 1380L, 1230L, 
1460L, 696L, 1050L, 1006L, 442L), X3 = c(1239L, 634L, 1037L, 
979L, 766L, 624L, 557L, 503L, 425L, 337L), X4 = c(1180L, 393L, 
883L, 699L, 641L, 456L, 478L, 378L, 321L, 227L), X5 = c(1178L, 
650L, 892L, 889L, 767L, 660L, 384L, 547L, 457L, 318L), X6 = c(3135L, 
1137L, 1493L, 1371L, 1024L, 1103L, 846L, 753L, 728L, 425L), X7 = c(1989L, 
807L, 1368L, 1071L, 1154L, 1055L, 662L, 658L, 680L, 435L), X8 = c(4469L, 
1917L, 2524L, 2294L, 1834L, 2082L, 1181L, 1240L, 1392L, 825L), 
    X9 = c(394L, 553L, 666L, 900L, 707L, 673L, 503L, 511L, 478L, 
    323L), X10 = c(619L, 1550L, 2069L, 1710L, 2023L, 1473L, 1137L, 
    1041L, 1069L, 886L)), .Names = c("X1", "X2", "X3", "X4", 
"X5", "X6", "X7", "X8", "X9", "X10"), row.names = c(11541L, 11861L, 
985L, 4702L, 301L, 234L, 5876L, 2530L, 247L, 5843L), class = "data.frame")

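Interestingly, the column names seem to be stored internally in an "X-integer" format. I am going to change the headers in the original file, but I would like to understand what I am doing wrong here.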

1 Answer:

Answer 0: (score: 3)

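You have a standard data.frame with integer columns. Column names are always of type character. You cannot assign column names to a subset of a data.frame (well, you can, but they are lost immediately, which is why there is no error). You probably want to assign values to a subset of the column names instead, which can be done with colnames(data)[1:10] <- .... Note also that the format of your data.frame is not useful for ggplot2, because that package prefers long-format data.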
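
To make the two points above concrete, here is a minimal sketch on a toy data frame. The toy data, the helper names `names_after_subset_assign`, `long` and `lvls`, and the use of check.names = FALSE to keep numeric-looking names are assumptions for illustration only; they are not taken from the asker's file.

```r
# Toy stand-in for the asker's data: numeric-looking column names.
# check.names = FALSE keeps "1", "2", "10" instead of "X1", "X2", "X10".
data <- data.frame(`1` = 1:2, `2` = 3:4, `10` = 5:6, check.names = FALSE)

# Renaming a *subset of the data frame*: runs without error,
# but the new names are dropped when the subset is written back.
colnames(data[, 1:3]) <- paste0("V", 1:3)
names_after_subset_assign <- colnames(data)   # still "1" "2" "10"

# Renaming a *subset of the names*: this is the form that sticks.
colnames(data)[1:3] <- paste0("V", 1:3)
colnames(data)                                # "V1" "V2" "V3"

# Long format (one row per column/value pair), built with base R only.
long <- data.frame(
  column = rep(names(data), each = nrow(data)),
  value  = unlist(data, use.names = FALSE),
  stringsAsFactors = FALSE
)

# Fix the display order explicitly: sort the names by their numeric part
# and use that as the factor levels, so "V10" does not sort before "V2".
lvls <- names(data)[order(as.numeric(sub("^V", "", names(data))))]
long$column <- factor(long$column, levels = lvls)
levels(long$column)                           # "V1" "V2" "V3"
```

With ggplot2 installed, a data frame shaped like `long` can be passed to something like ggplot(long, aes(column, value)), and the discrete axis should then follow the factor levels rather than alphabetical order.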