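I need to convert a data frame to a numeric matrix. However, when I use the data.matrix() function, the decimal values in some columns are converted to completely different numbers, and I don't know why. Can someone tell me what is going on?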
> head(x[,1:5])
TCGA-AA-3520-01A-01R-0821-07 TCGA-AA-3532-01A-01R-0821-07 TCGA-AA-3553-01A-01R-0821-07 TCGA-A6-2674-01A-02R-0821-07 TCGA-AA-3521-01A-01R-0821-07
ELMO2 -0.840833333333333 0.018 0.354916666666667 -0.203750 0.6890000
CREB3L1 1.333 0.7625 0.13475 2.498750 1.1572500
RPS11 1.4755 0.3245 0.634 0.483125 0.9526250
PNMA1 -1.39075 -1.48725 -0.8305 -0.463250 -2.2230000
MMP2 0.0278333333333333 -0.2065 0.0666666666666666 2.156000 0.1501667
C10orf90 -2.5495 -2.76575 -2.76375 -2.482250 -2.1107500
> head(data.matrix(x[,1:5]))
TCGA-AA-3520-01A-01R-0821-07 TCGA-AA-3532-01A-01R-0821-07 TCGA-AA-3553-01A-01R-0821-07 TCGA-A6-2674-01A-02R-0821-07 TCGA-AA-3521-01A-01R-0821-07
ELMO2 3323 94 1701 -0.203750 0.6890000
CREB3L1 4307 3022 654 2.498750 1.1572500
RPS11 4485 1458 2786 0.483125 0.9526250
PNMA1 4379 4438 3397 -0.463250 -2.2230000
MMP2 155 932 328 2.156000 0.1501667
C10orf90 5139 5193 5230 -2.482250 -2.1107500
> class(x)
[1] "data.frame"
> str(x)
'data.frame': 6150 obs. of 174 variables:
$ TCGA-AA-3520-01A-01R-0821-07: Factor w/ 5538 levels "","0","0.000166666666666662",..: 3323 4307 4485 4379 155 5139 4177 1400 4735 3363 ...
$ TCGA-AA-3532-01A-01R-0821-07: Factor w/ 5597 levels "","0.000499999999999968",..: 94 3022 1458 4438 932 5193 1374 2757 4671 2503 ...
$ TCGA-AA-3553-01A-01R-0821-07: Factor w/ 5550 levels "","0.000249999999999995",..: 1701 654 2786 3397 328 5230 65 194 4900 3966 ...
$ TCGA-A6-2674-01A-02R-0821-07: num -0.204 2.499 0.483 -0.463 2.156 ...
$ TCGA-AA-3521-01A-01R-0821-07: num 0.689 1.157 0.953 -2.223 0.15 ...
$ TCGA-AA-3534-01A-01R-0821-07: num -0.6789 -0.0877 1.5736 -1.6678 -0.7148 ...
$ TCGA-AA-3555-01A-01R-0821-07: Factor w/ 5580 levels "","-0.00012499999999999",..: 373 4970 2076 519 1344 5084 3882 1285 4760 2778 ...
$ TCGA-A6-2670-01A-02R-0821-07: num 0.588 0.569 0.808 -1.661 1.073 ...
$ TCGA-A6-2683-01A-01R-0821-07: num -0.77 0.741 1.564 -2.984 -1.569 ...
$ TCGA-AA-3526-01A-02R-0821-07: num -0.824 2.215 0.819 -1.846 -0.862 ...
$ TCGA-A6-2677-01A-01R-0821-07: num -0.733 0.526 0.892 -1.598 -1.69 ...
$ TCGA-AA-3522-01A-01R-0821-07: num -0.981 2.094 0.818 -1.048 -1.452 ...
$ TCGA-AA-3538-01A-01R-0821-07: num -0.144 0.631 0.794 -1.523 -0.198 ...
$ TCGA-AA-3556-01A-01R-0821-07: Factor w/ 5556 levels "","-0.000125000000000014",..: 2256 4772 3446 4253 4040 4927 3026 316 3766 3221 ...
$ TCGA-A6-2678-01A-01R-0821-07: num -1.38 1.706 1.103 -2.725 -0.918 ...
$ TCGA-AA-3524-01A-02R-0821-07: Factor w/ 5611 levels "","-0.0005","0.000500000000000006",..: 4062 3671 4749 4751 4051 5226 2623 1227 4252 1489 ...
$ TCGA-AA-3542-01A-02R-0821-07: num -1.195 0.641 1.952 -1.63 -1.264 ...
$ TCGA-AA-3558-01A-01R-0821-07: Factor w/ 5580 levels "","0.000375000000000007",..: 4245 3920 4277 4910 4766 5126 1450 3350 4898 1915 ...
$ TCGA-AA-3544-01A-01R-0821-07: num -0.157 0.649 0.937 -1.941 -1.417 ...
$ TCGA-AA-3560-01A-01R-0821-07: num -0.146 0.554 0.581 -2.503 -0.438 ...
$ TCGA-AA-3514-01A-02R-0821-07: Factor w/ 5678 levels "","0","0.000375000000000028",..: 3800 2056 2422 1158 1507 4620 3564 1877 5480 4076 ...
$ TCGA-AA-3527-01A-01R-0821-07: num -0.3973 -0.0915 1.4019 -2.5513 -0.395 ...
$ TCGA-AA-3548-01A-01R-0821-07: Factor w/ 5470 levels "","0.000100000000000011",..: 2590 3817 3388 4531 2770 4922 2715 406 4473 2711 ...
$ TCGA-AA-3561-01A-01R-0821-07: num -1.115 1.01 1.266 -1.419 -0.537 ...
$ TCGA-AA-3517-01A-01R-0821-07: Factor w/ 5604 levels "","-0.000333333333333335",..: 479 1182 4514 5003 4005 4799 1499 4796 849 3079 ...
$ TCGA-AA-3529-01A-02R-0821-07: Factor w/ 5583 levels "","-0.000124999999999978",..: 2912 3970 4073 4555 4257 5238 3242 2668 899 3508 ...
$ TCGA-AA-3549-01A-02R-0821-07: Factor w/ 5538 levels "","0.000166666666666671",..: 1378 4762 4356 4857 519 4739 1254 4777 350 444 ...
$ TCGA-AA-3562-01A-02R-0821-07: Factor w/ 5628 levels "","0","0.000249999999999993",..: 2453 3556 3523 4987 2236 5148 1681 1854 2249 4096 ...
Answer 0 (score: 1)
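The data.matrix() function converts factors to numbers using their internal integer codes. That is why the columns listed as factors in the str() output come out with different values after data.matrix(), while the columns that are already numeric are unchanged. To create a numeric matrix in this case, try the following: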
y <- apply(as.matrix(x[, 1:5]), 2, as.numeric)
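With as.matrix(), the factors become character strings. apply() then calls as.numeric on each column, converting everything to numbers without losing the matrix structure.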
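To make the difference concrete, here is a minimal sketch on a toy data frame. The values, the column names a and b, and the row names g1 to g4 are invented for illustration and are not taken from the question's data. It shows data.matrix() returning level codes, then an alternative that converts only the factor columns via as.character() before building the matrix. Unlike the apply() one-liner, this variant keeps the row names and does not pass the already-numeric columns through a character representation.

```r
# Toy data frame (hypothetical values): one factor column, one numeric column
df <- data.frame(
  a = factor(c("-0.84", "1.333", "", "0.018")),
  b = c(-0.20375, 2.49875, 0.483125, -0.46325),
  row.names = c("g1", "g2", "g3", "g4")
)

# data.matrix() replaces the factor with its integer level codes
codes <- data.matrix(df)

# Convert factor columns through character first; leave other columns alone
df_num <- df
is_fac <- vapply(df_num, is.factor, logical(1))
df_num[is_fac] <- lapply(df_num[is_fac],
                         function(col) as.numeric(as.character(col)))

# All columns are numeric now, so as.matrix() yields a numeric matrix.
# The empty string becomes NA, and the row names are preserved.
m <- as.matrix(df_num)
```

Comparing codes[, "a"] with m[, "a"] shows the same effect as in the question: the first holds positions in the level table, the second holds the actual values.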
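As Stephen Henderson mentioned in the comments, it is a good idea to find out why the numeric values stored in the data frame are being treated as factors in the first place.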
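One way to investigate is to list the entries of a factor column that do not parse as numbers. The sketch below uses an invented column; the "null" marker is a hypothetical example of a stray value. The str() output in the question does show an empty string among the factor levels, so blanks are one candidate, but the question does not show how x was read in, so the real cause may differ.

```r
# Hypothetical factor column standing in for one of the sample columns
col <- factor(c("0.5", "", "-1.25", "null", "3"))

# Entries that fail to parse as numbers are the likely reason the
# column was not read as numeric
vals <- as.character(col)
parsed <- suppressWarnings(as.numeric(vals))
bad <- unique(vals[is.na(parsed)])
bad

# If the data come from a text file, declaring such markers as missing
# values at read time lets the column be read as numeric directly.
# The inline text and the "null" marker are assumptions for illustration.
txt <- "gene,s1,s2\nELMO2,0.5,1.2\nRPS11,null,0.3\nMMP2,,2.1"
d <- read.csv(text = txt, row.names = 1, na.strings = c("", "null"))
str(d)
```

Once the offending values are known, fixing them at import time is usually cleaner than converting factors afterwards.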