I have a dataset whose columns are factor vectors:
> str(gdp)
'data.frame': 64 obs. of 31 variables:
$ 1 : Factor w/ 62 levels "","1,145.31",..: 1 1 1 53 16 20 22 24 30 32 ...
$ 2 : Factor w/ 64 levels "1,121.93","1,264.63",..: 42 59 10 13 18 16 17 23 25 35 ...
$ 3 : Factor w/ 62 levels "","1,072.07",..: 1 1 1 35 36 39 41 42 45 51 ...
$ 4 : Factor w/ 62 levels "","1,076.03",..: 1 1 1 15 16 21 23 26 27 36 ...
$ 5 : Factor w/ 62 levels "","1,023.09",..: 1 1 1 11 15 19 17 23 21 27 ...
$ 6 : Factor w/ 62 levels "","1,003.81",..: 1 1 1 40 45 46 47 52 56 7 ...
$ 7 : Factor w/ 62 levels "","1,137.23",..: 1 1 1 13 15 19 21 23 24 28 ...
$ 8 : Factor w/ 62 levels "","1,198.30",..: 1 1 1 26 31 34 35 39 40 47 ...
$ 9 : Factor w/ 64 levels "1,114.32","1,519.23",..: 27 30 36 41 49 51 50 54 56 64 ...
$ 10: Factor w/ 62 levels "","1,208.85",..: 1 1 1 35 39 40 42 45 46 53 ...
$ 11: Factor w/ 64 levels "","1,089.33",..: 1 11 17 20 23 24 26 29 31 37 ...
$ 12: Factor w/ 62 levels "","1,037.14",..: 1 1 1 22 23 25 31 30 36 41 ...
$ 13: Factor w/ 63 levels "","1,114.20",..: 1 63 1 8 11 12 14 20 22 27 ...
$ 14: Factor w/ 64 levels "1,169.73","1,409.74",..: 63 12 14 16 17 22 24 25 28 30 ...
$ 15: Factor w/ 62 levels "","1,117.66",..: 1 1 1 33 35 39 40 44 43 53 ...
$ 16: Factor w/ 63 levels "","1,045.73",..: 21 1 1 30 35 38 41 42 47 50 ...
$ 17: Factor w/ 62 levels "","1,088.39",..: 1 1 1 24 32 26 34 38 40 48 ...
$ 18: Factor w/ 62 levels "","1,244.71",..: 1 1 1 24 30 31 33 34 38 44 ...
$ 19: Factor w/ 62 levels "","1,155.37",..: 1 1 1 25 34 37 38 41 44 48 ...
$ 20: Factor w/ 64 levels "","1,198.29",..: 1 63 8 11 15 17 18 20 26 30 ...
$ 21: Factor w/ 36 levels "","1,065.67",..: 1 1 1 1 1 1 1 1 1 1 ...
$ 22: Factor w/ 64 levels "1,123.06","1,315.12",..: 12 14 15 17 22 23 24 26 27 40 ...
$ 23: Factor w/ 62 levels "","1,016.31",..: 1 1 1 22 25 31 33 38 43 49 ...
$ 24: Factor w/ 64 levels "1,029.92","1,133.27",..: 52 53 57 60 6 8 9 12 13 22 ...
$ 25: Factor w/ 64 levels "1,222.15","1,517.69",..: 60 62 7 8 12 14 15 21 22 25 ...
$ 26: num NA NA 1.29 1.32 1.36 1.39 1.43 1.62 1.56 1.72 ...
$ 27: Factor w/ 62 levels "","1,036.85",..: 1 1 1 12 16 21 22 27 25 33 ...
$ 28: Factor w/ 61 levels "","1,052.88",..: 1 1 1 12 13 17 18 24 23 26 ...
$ 29: Factor w/ 64 levels "1,018.62","1,081.27",..: 6 7 8 9 10 26 27 34 35 43 ...
$ 30: Factor w/ 62 levels "","1,203.92",..: 1 1 1 6 5 21 22 23 24 32 ...
$ 31: Factor w/ 62 levels "","1,039.85",..: 1 1 1 57 59 9 11 13 14 16 ...
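I am trying to keep all the information (the decimals) and convert every vector to numeric. So far I have tried converting the vectors to character and then to numeric, as suggested elsewhere on Stack Overflow, but I get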
> gdp<-data.frame(lapply(gdp,as.character))
> gdp<-data.frame(lapply(gdp,as.numeric))
> str(gdp)
'data.frame': 64 obs. of 31 variables:
$ X1 : num 1 1 1 53 16 20 22 24 30 32 ...
$ X2 : num 42 59 10 13 18 16 17 23 25 35 ...
$ X3 : num 1 1 1 35 36 39 41 42 45 51 ...
$ X4 : num 1 1 1 15 16 21 23 26 27 36 ...
$ X5 : num 1 1 1 11 15 19 17 23 21 27 ...
$ X6 : num 1 1 1 40 45 46 47 52 56 7 ...
$ X7 : num 1 1 1 13 15 19 21 23 24 28 ...
$ X8 : num 1 1 1 26 31 34 35 39 40 47 ...
$ X9 : num 27 30 36 41 49 51 50 54 56 64 ...
$ X10: num 1 1 1 35 39 40 42 45 46 53 ...
$ X11: num 1 11 17 20 23 24 26 29 31 37 ...
$ X12: num 1 1 1 22 23 25 31 30 36 41 ...
$ X13: num 1 63 1 8 11 12 14 20 22 27 ...
$ X14: num 63 12 14 16 17 22 24 25 28 30 ...
$ X15: num 1 1 1 33 35 39 40 44 43 53 ...
$ X16: num 21 1 1 30 35 38 41 42 47 50 ...
$ X17: num 1 1 1 24 32 26 34 38 40 48 ...
$ X18: num 1 1 1 24 30 31 33 34 38 44 ...
$ X19: num 1 1 1 25 34 37 38 41 44 48 ...
$ X20: num 1 63 8 11 15 17 18 20 26 30 ...
$ X21: num 1 1 1 1 1 1 1 1 1 1 ...
$ X22: num 12 14 15 17 22 23 24 26 27 40 ...
$ X23: num 1 1 1 22 25 31 33 38 43 49 ...
$ X24: num 52 53 57 60 6 8 9 12 13 22 ...
$ X25: num 60 62 7 8 12 14 15 21 22 25 ...
$ X26: num NA NA 1 2 3 4 5 7 6 8 ...
$ X27: num 1 1 1 12 16 21 22 27 25 33 ...
$ X28: num 1 1 1 12 13 17 18 24 23 26 ...
$ X29: num 6 7 8 9 10 26 27 34 35 43 ...
$ X30: num 1 1 1 6 5 21 22 23 24 32 ...
$ X31: num 1 1 1 57 59 9 11 13 14 16 ...
which neither keeps the decimals nor fills the blanks in as NA. I also tried
> gdp<-as.numeric(levels(gdp))[gdp]
Error in as.numeric(levels(gdp))[gdp] : invalid subscript type 'list'
Is there a way to convert these vectors to numeric?
Answer 0 (score: 0)
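Let's break this down.
First, because gdp is a data frame, levels(gdp) returns NULL. You are probably looking for the output of levels on each column of gdp. In that case, you need to use lapply.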
levels(gdp)
# NULL, because a data frame itself has no levels
lapply(gdp, levels)
# a list with the levels of each column (NULL for the numeric column)
as.numeric(levels(gdp))[gdp]
# Error: invalid subscript type 'list'
The error says that you cannot use a list (gdp) to subscript a vector.
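Subscripting by a single factor is valid, and that is what the idiom relies on. Here is a minimal sketch, assuming a hypothetical factor `f` that stands in for one column of gdp (the values are made up for illustration):

```r
# Hypothetical factor standing in for a single column of gdp
f <- factor(c("", "1,145.31", "987.50", "1,145.31"))

# as.numeric()/as.integer() on a factor return the internal integer codes,
# not the numbers that the labels spell out
codes <- as.integer(f)

# Remove the thousands separators from the levels, convert the levels once,
# then subscript by the factor so every element picks up its own value.
# The empty string "" converts to NA.
vals <- as.numeric(gsub(",", "", levels(f), fixed = TRUE))[f]
vals
```

This may also explain the first attempt in the question: before R 4.0.0, data.frame() converted character columns back to factors by default (stringsAsFactors = TRUE), so the later as.numeric() call would have seen factors again and returned their codes.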
To go over the columns of gdp, you need something like lapply to process each component.
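gdp <- data.frame(lapply(gdp, function(x) {
  # Leave non-factor columns (such as column 26) unchanged.
  if (!is.factor(x)) return(x)
  # Drop the thousands separators from the levels, convert them to numeric,
  # then subscript by the factor so each row gets its own value.
  as.numeric(gsub(",", "", levels(x), fixed = TRUE))[x]
}))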
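To check the behaviour on something small, the same conversion can be tried on a toy data frame. This is only a sketch: `toy` and `to_num` are hypothetical names and the values are invented, not taken from gdp.

```r
# Hypothetical miniature of gdp: two factor columns and one numeric column
toy <- data.frame(
  a = factor(c("", "1,145.31", "987.50")),
  b = factor(c("1,121.93", "", "2,000.00")),
  c = c(NA, 1.29, 1.32)
)

# Convert one column: return non-factors as they are, otherwise clean the
# levels, convert them, and subscript by the factor
to_num <- function(x) {
  if (!is.factor(x)) return(x)
  as.numeric(gsub(",", "", levels(x), fixed = TRUE))[x]
}

# Assigning into toy[] replaces the columns in place and keeps their names
toy[] <- lapply(toy, to_num)
str(toy)
```

Assigning into `toy[]` instead of wrapping the result in data.frame() is a design choice: it leaves the column names untouched, whereas data.frame() would rewrite names such as 1 to X1.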
Your dataset may be better off as a matrix, since it all appears to be numeric. In that case:
gdp <- as.matrix(gdp)
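Note that the order of the steps matters. The sketch below uses two hypothetical data frames (invented values) to show that as.matrix() gives a numeric matrix only once every column is already numeric:

```r
# Hypothetical data frame whose columns have already been converted
toy_num <- data.frame(a = c(NA, 1145.31, 987.5), c = c(NA, 1.29, 1.32))
m <- as.matrix(toy_num)
is.numeric(m)    # every column is numeric, so the matrix is numeric

# With a factor column still present, as.matrix() produces a character matrix
toy_fac <- data.frame(a = factor(c("", "1,145.31")), c = c(NA, 1.29))
m_chr <- as.matrix(toy_fac)
is.character(m_chr)
```

So the factor columns should be converted first and as.matrix() applied afterwards.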