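I am cleaning a database in which one column contains only numbers written with "," as the thousands separator. I used sub() to remove the "," and then as.numeric() to convert the column to numeric, but the conversion fails because NAs are introduced. Most of the values that became NA were originally large numbers. Why? I just can't figure it out. Can anyone help? Many thanks. The code is as follows: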
> md_1819 <- read.csv(file=file.choose(), header=T)
> str(md_1819)
'data.frame': 60621 obs. of 31 variables:
$ Sales.Group : Factor w/ 26 levels "","D01 ÕÅÓÂ 18822979336",..: 16 17 17 17 17 17 17 17 17 25 ...
$ Ship.To.Party : Factor w/ 1876 levels "","6500000016 ASSAB TOOLING TECHNO",..: 167 1876 259 1876 259 1876 259 1876 259 110 ...
$ Period.year : Factor w/ 13 levels "","001.2019 1. Period 2019",..: 10 7 7 7 7 7 7 7 7 13 ...
$ Time : num 2018 2018 2018 2018 2018 ...
$ Quality : Factor w/ 71 levels "","2556 Uddeholm UHB 11",..: 20 52 52 52 52 52 52 22 22 60 ...
$ Reporting.quantity: Factor w/ 13591 levels "-247.00","-7.52",..: 4 4 4 4 4 4 4 4 4 4 ...
> md_1819$Reporting.quantity <- as.numeric(sub("," , "", md_1819$Reporting.quantity))
Warning message:
NAs introduced by coercion
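
One possible explanation, sketched under the assumption that the large values contain more than one comma (for example "1,234,567.00"): sub() replaces only the first match in each string, so a second comma survives and as.numeric() cannot parse the result. gsub() replaces every match. The sample values and the variable names below are made up for illustration; they are not taken from the file above.

```r
# Hypothetical sample values (assumption): one with no comma,
# one with a single comma, one with two commas.
x <- factor(c("-247.00", "1,234.50", "1,234,567.00"))

# sub() removes only the FIRST comma in each string,
# so the third value still contains a comma afterwards.
first_only <- sub(",", "", as.character(x), fixed = TRUE)

# gsub() removes EVERY comma; fixed = TRUE treats "," as a literal string.
all_removed <- gsub(",", "", as.character(x), fixed = TRUE)

# With no commas left, the conversion succeeds for all three values.
as_num <- as.numeric(all_removed)

# Converting the sub() result instead leaves an NA for the third value.
as_num_first_only <- suppressWarnings(as.numeric(first_only))
```

If this is the cause, swapping sub() for gsub() in the original call should be enough. Note that the empty level "" shown in the str() output would still become NA after conversion.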