Sort a character matrix by a numeric column

Time: 2015-01-13 01:10:12

Tags: r matrix character

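I am reading in a matrix from a csv file that contains both numbers and characters. This is a smaller matrix, but it is basically what I am working with: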

[,1] [,2] [,3]         [,4]    [,5]    [,6]    [,7]    [,8]    [,9]
V2  "A"  "1"  "Sample X1"  "34712" "39390" "38858" "38574" "38660" 
V3  "A"  "2"  "Sample X2"  "35333" "39940" "40533" "39936" "40669" 
V4  "A"  "3"  "Sample X3"  "33612" "39601" "38658" "39220" "39465" 
V5  "A"  "4"  "Sample X4"  "34309" "39200" "38597" "39820" "40081" 
V6  "A"  "5"  "Sample X5"  "33637" "39404" "40497" "39388" "40033" 
V7  "A"  "6"  "Sample X6"  "35314" "39522" "40345" "38624" "40306" 
V8  "A"  "7"  "Sample X7"  "35548" "39000" "41408" "38310" "39849" 
V9  "A"  "8"  "Sample X8"  "33972" "39930" "39777" "39582" "39570" 
V10 "A"  "9"  "Sample X9"  "34808" "39857" "39252" "39248" "38465" 
V11 "A"  "10" "Sample X10" "34316" "39798" "39776" "39516" "38812" 
V12 "A"  "11" "Sample X11" "34476" "38581" "39672" "38997" "38794" 
V13 "A"  "12" "Sample X12" "36246" "38809" "37872" "38100" "36925" 
V14 "B"  "1"  "Sample X13" "33642" "40201" "40202" "39320" "40426" 
V15 "B"  "2"  "Sample X14" "33381" "40624" "40349" "41350" "40490" 
V16 "B"  "3"  "Sample X15" "34465" "42096" "41194" "40613" "40416" 
V17 "B"  "4"  "Sample X16" "33957" "41905" "42273" "40710" "40681" 
V18 "B"  "5"  "Sample X17" "33877" "42040" "42226" "40788" "41261" 
V19 "B"  "6"  "Sample X18" "33970" "41860" "41149" "41093" "40877" 
V20 "B"  "7"  "Sample X19" "34745" "42040" "40186" "40862" "41044" 
V21 "B"  "8"  "Sample X20" "34140" "41274" "39880" "40356" "40496" 
V22 "B"  "9"  "Sample X21" "33929" "40652" "41410" "40760" "40718" 
V23 "B"  "10" "Sample X22" "33684" "39220" "40478" "41500" "40094"
V24 "B"  "11" "Sample X23" "33141" "41446" "41121" "40726" "41020"
V25 "B"  "12" "Sample X24" "33405" "38481" "37716" "38562" "38218" 
V26 "C"  "1"  "Sample X25" "71560" "86402" "85614" "84273" "83264" 
V27 "C"  "2"  "Sample X26" "72144" "86266" "88082" "87672" "87356" 
V28 "C"  "3"  "Sample X27" "71946" "90201" "89156" "88386" "88006" 
V29 "C"  "4"  "Sample X28" "71758" "89108" "88225" "86006" "88654" 
V30 "C"  "5"  "Sample X29" "71144" "86558" "88614" "87028" "88809" 
V31 "C"  "6"  "Sample X30" "70504" "89230" "88869" "86653" "86356" 
V32 "C"  "7"  "Sample X31" "67874" "88405" "84878" "84914" "85425" 
V33 "C"  "8"  "Sample X32" "70273" "87865" "87529" "87945" "86172" 

I would like to sort the matrix by the second column, with no header, so that it goes:

A 1 . . .
B 1
C 1
A 2
B 2
C 2
A 3
. 
.
.
A 12
B 12
C 12 . . .

I looked around and found that you can use order:

data <- data[order(data[,2]),]

But then it comes out like this:

A 1 . . .
B 1
C 1
A 10
B 10
C 10
A 11
B 11
C 11
A 12
B 12
C 12
A 2
B 2
C 2
.
.
.
A 9
B 9
C 9 . . .

Is this because the matrix is a character matrix? How can I make only the second column numeric so that I can sort by it?

Thanks

1 answer:

Answer 0: (score: 1)

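If you want to mix classes across columns (e.g. numeric and character), keeping the data in a matrix is a bad idea, because a matrix can hold only one type and everything gets coerced to character. You should use a data frame instead.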

Ideally, read the data straight into a data frame with read.csv or read.table. Otherwise, coerce your matrix to a data frame with as.data.frame.
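As a rough sketch of the first route, here is what reading directly into a data frame might look like. The inline `csv_text` is a made-up stand-in for the real file (which I have not seen), and `header = FALSE` assumes the file has no header row, as the question suggests.

```r
# Hypothetical inline CSV standing in for the real file.
csv_text <- "A,1,Sample X1,34712
A,2,Sample X2,35333
A,10,Sample X10,34316
B,1,Sample X13,33642
B,2,Sample X14,33381
B,10,Sample X22,33684"

# header = FALSE: assume no header row, so columns are named V1, V2, ...
d <- read.csv(text = csv_text, header = FALSE, stringsAsFactors = FALSE)

# read.csv picks a type per column, so V2 is already integer here
str(d)

# Sort by the number first, then by the letter to break ties
d_sorted <- d[order(d$V2, d$V1), ]
print(d_sorted)
```

With a real file you would pass the file path instead of `text = csv_text`.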

Given a matrix m (data in your case):

d <- as.data.frame(m, stringsAsFactors=FALSE)
d[, 3] <- as.numeric(d[, 3]) # coerce the relevant column to numeric
d[order(d[, 3]), ]
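For something you can paste and run, here is the same idea on a toy matrix. The matrix and the `key_col` variable are made up for illustration; I am assuming the numeric key sits in column 2, as in the printout in the question, so adjust the index to wherever it lives in your data.

```r
# Toy character matrix (hypothetical, mimics the layout in the question)
m <- matrix(c("A", "1",  "Sample X1",  "34712",
              "A", "2",  "Sample X2",  "35333",
              "A", "10", "Sample X10", "34316",
              "B", "1",  "Sample X13", "33642",
              "B", "2",  "Sample X14", "33381",
              "B", "10", "Sample X22", "33684"),
            ncol = 4, byrow = TRUE)

key_col <- 2  # assumed position of the numeric key

# Convert to a data frame, keeping strings as plain character columns
d <- as.data.frame(m, stringsAsFactors = FALSE)

# Make only the key column numeric
d[, key_col] <- as.numeric(d[, key_col])

# order() leaves ties in their original order, so A comes before B
d_sorted <- d[order(d[, key_col]), ]
print(d_sorted)
```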

Note that you can sort the matrix as desired with m[order(as.numeric(m[, 3])), ], but the resulting columns will still be character.
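A small sketch of that matrix-only route, again on made-up data and assuming the key is in column 2 of this toy matrix:

```r
# Toy two-column character matrix (hypothetical)
m <- matrix(c("A", "1",
              "A", "2",
              "A", "10",
              "B", "1",
              "B", "2",
              "B", "10"),
            ncol = 2, byrow = TRUE)

# as.numeric() is applied only to build the sort key;
# the matrix itself is never converted
m_sorted <- m[order(as.numeric(m[, 2])), ]
print(m_sorted)

# The result is still a character matrix
typeof(m_sorted)
```

This is fine if you only need the row order, but any arithmetic on the other columns will still need a conversion later.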

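Note: the explanation for the sorting behavior you witnessed is that, for a character vector, anything that starts with 1 (e.g. 10) comes before 2, because strings are compared character by character rather than by numeric value.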
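You can see the difference on a short vector. This is only an illustration; `x` is a made-up example.

```r
x <- c("1", "2", "9", "10", "11", "12")

# Character sort compares strings one character at a time,
# so "10", "11" and "12" land between "1" and "2"
chr_sorted <- sort(x)
print(chr_sorted)

# Numeric sort compares values
num_sorted <- sort(as.numeric(x))
print(num_sorted)
```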