>str(data$Installs)
 $ Installs: Factor w/ 21 levels "","0+","1+","1,000+",..: 8 20 15 18 11 17 17 5 5 8 ...
db$Installs = as.character(gsub("\\+", "", db$Installs))
str(db$Installs)
chr [1:10841] "10,000" "500,000" "5,000,000" "50,000,000" "100,000" "50,000" "50,000" "1,000,000" "1,000,000" "10,000" ...
db$Installs = as.double(gsub(",","",db$Installs))
str(db$Installs)
num [1:10841] 1e+04 5e+05 5e+06 5e+07 1e+05 5e+04 5e+04 1e+06 1e+06 1e+04 ...
I want the variable to look like this:
"10000" "500000" "5000000" "50000000" "100000" "50000" "50000" "1000000" "1000000" "10000" ...
db$Installs.factor <- factor(db$Installs)
db$Installs = as.character(gsub("\\+", "", db$Installs))
db$Installs = as.double(gsub(",","",db$Installs))
Answer 0 (score: 1)
Try this.
Input -
sample <- c("10,000+" ,"500,000+", "5,000,000+", "50,000,000+" ,"100,000+", "50,000+" ,"50,000+" ,"1,000,000+" )
Solution -
sample <- as.numeric(gsub("\\D", "", sample))
Output -
[1] 10000 500000 5000000 50000000 100000 50000 50000 1000000
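The same substitution should carry over to the factor column from the question. The following is only a minimal sketch: the small `db` data frame here is a hypothetical stand-in for the real data, and the empty level `""` is included because the `str()` output in the question lists it.

```r
# Hypothetical stand-in for the question's data frame `db`
db <- data.frame(
  Installs = factor(c("10,000+", "500,000+", "5,000,000+", ""))
)

# gsub() coerces the factor to character, and "\\D" removes every
# non-digit character (both the commas and the trailing "+")
db$Installs <- as.numeric(gsub("\\D", "", db$Installs))

# The empty level has no digits left, so it becomes NA
str(db$Installs)
```

Because `"\\D"` strips anything that is not a digit, a level with no digits at all ends up as `NA`, so it may be worth checking `sum(is.na(db$Installs))` after the conversion.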
Note - if you want to force R not to use exponential notation, you can use -
options("scipen"=100, "digits"=4)
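'scipen': integer. A penalty to be applied when deciding to print numeric values in fixed or exponential notation. Positive values bias towards fixed and negative towards scientific notation: fixed notation will be preferred unless it is more than 'scipen' digits wider.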
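If the goal is the quoted strings shown in the question rather than a change to how every number prints, one alternative is to format the values explicitly instead of setting a global option. This is a sketch under that assumption; `x` and `fixed_chr` are illustrative names, not part of the original code.

```r
# Illustrative numeric values, as produced by the conversion above
x <- c(1e4, 5e5, 5e6, 5e7)

# format() with scientific = FALSE returns character strings in fixed
# notation; trim = TRUE drops the padding used to align widths
fixed_chr <- format(x, scientific = FALSE, trim = TRUE)
print(fixed_chr)

# sprintf() with "%.0f" is another way to get the same strings
fixed_chr2 <- sprintf("%.0f", x)
print(fixed_chr2)
```

This leaves `options("scipen")` untouched, so other output in the session keeps its default formatting. The stored values remain numeric either way; only the printed or formatted representation differs.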