My data frame, theDF, has the following column modes:
> apply(theDF, 2, mode)
ID Ticker INDUSTRY_SECTOR VAR CVAR
"character" "character" "character" "character" "character"
As the output above shows, these are all character columns. Converting everything with as.numeric gives:
> apply(theDF, 2, as.numeric)
ID Ticker INDUSTRY_SECTOR VAR CVAR
[1,] 1 NA NA 0.00 0.000
[2,] 2 NA NA -181412.82 -301731.228
[3,] 3 NA NA 61711.95 102641.163
[4,] 4 NA NA 1095.16 1821.503
[5,] 5 NA NA 16498.22 27440.332
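What I want is something that converts only the number-like vectors to numeric. Basically, if a column "looks like" a number, make it numeric; otherwise leave it alone. I could not find anything on StackOverflow that does not require knowing the names or positions of the columns to convert. This DF will not always have its columns in the same order, or even have the same columns, so I need a dynamic way to check which columns "look like" numbers and make those columns numeric.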
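The as.numeric call above (obviously) gives me a bunch of NAs for the character columns. I also tried wrapping it in tryCatch: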
> apply(theDF, 2, function(x) tryCatch(as.numeric(x),error=function(e) e, warning=function(w) x))
ID Ticker INDUSTRY_SECTOR VAR CVAR
[1,] "1" "USD CASH" "" "0" "0"
[2,] "2" "ZAR CASH" "" "-181412.82055904" "-301731.22832191"
[3,] "3" "BAT SJ EQUITY" "Financial" "61711.951234826" "102641.162795691"
[4,] "4" "HCI SJ EQUITY" "Financial" "1095.16002541256" "1821.50290513369"
[5,] "5" "PSG SJ EQUITY" "Financial" "16498.2192382422" "27440.331617902"
Not only does that tryCatch attempt not work, it also seems terribly inefficient. Every column still reports as character:
> apply(theDF, 2, mode)
ID Ticker INDUSTRY_SECTOR VAR CVAR
"character" "character" "character" "character" "character"
> sapply(theDF, mode)
ID Ticker INDUSTRY_SECTOR VAR CVAR
"character" "character" "character" "character" "character"
> apply(theDF, 2, class)
ID Ticker INDUSTRY_SECTOR VAR CVAR
"character" "character" "character" "character" "character"
> sapply(theDF, class)
ID Ticker INDUSTRY_SECTOR VAR CVAR
"character" "character" "character" "character" "character"
Is there a better way to do this?
EDIT: People keep asking for this, so here ... ...
Answer 0 (score: 13)
This looks like a job for type.convert().
theDF[] <- lapply(theDF, type.convert, as.is = TRUE)
## check the result
sapply(theDF, class)
# ID Ticker INDUSTRY_SECTOR VAR CVAR
# "integer" "character" "character" "numeric" "numeric"
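type.convert() coerces a vector to the "most appropriate" type. Setting as.is = TRUE lets us keep character vectors as character; otherwise they would be coerced to factors.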
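To make the effect of as.is concrete, here is a minimal sketch. The sample vectors are made up for illustration and are not the question's data:

```r
# Hypothetical sample vectors (assumptions, not taken from theDF)
num_like <- c("1", "2.5", "-3")
int_like <- c("1", "2", "3")
text     <- c("USD CASH", "ZAR CASH", "BAT SJ EQUITY")

# With as.is = TRUE, number-like strings become numbers
# and everything else stays character
cls_num  <- class(type.convert(num_like, as.is = TRUE))   # "numeric"
cls_int  <- class(type.convert(int_like, as.is = TRUE))   # "integer"
cls_text <- class(type.convert(text, as.is = TRUE))       # "character"

# With as.is = FALSE, non-numeric strings are turned into a factor
cls_fac  <- class(type.convert(text, as.is = FALSE))      # "factor"
```

This is why the ID column above comes back as integer while VAR and CVAR come back as numeric.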
Update: columns that are not character need to be coerced to character first.
theDF[] <- lapply(theDF, function(x) type.convert(as.character(x), as.is = TRUE))
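As a rough illustration of the update, the sketch below applies the same one-liner to a data frame whose text columns are factors. The data frame df and its values are hypothetical stand-ins, not the asker's data:

```r
# Hypothetical data frame with factor columns (an assumption for illustration)
df <- data.frame(
  ID     = 1:3,
  Ticker = c("USD CASH", "ZAR CASH", "BAT SJ EQUITY"),
  VAR    = c("10.5", "20", "30"),
  stringsAsFactors = TRUE   # force the string columns to be factors
)

# Going through as.character() first means the factor labels are parsed,
# not the underlying integer codes
df[] <- lapply(df, function(x) type.convert(as.character(x), as.is = TRUE))

col_classes <- sapply(df, class)
# ID: "integer", Ticker: "character", VAR: "numeric"
```

Assigning into df[] rather than df keeps the result a data frame with the original column names.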