Converting a factor to an integer

Asked: 2019-03-31 20:39:36

Tags: r replace

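I am analyzing some data that includes a wage variable. The values contain the symbol "€" and a suffix of either "M" or "K".

I tried to solve this with the gsub() function, but my code does not work: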

Integer_converter <- function(strWage) { 
  Factor_Wage = gsub("€", " ", strWage)
}

Factor_converter_1 <- function(strWage) {
  Integer_Wage = gsub("M", " ", strWage)
}

Factor_converter_2 <- function(strWage) {
  Integer_wage = as.integer(as.integer(gsub("K", "", strWage)) / 100) 
}

The actual values look like this:

$ Wage <fct> €405K, €195K, €205K, €240K, €175K, €25K, €205K, €57K, €140K, €135K, €15K, €45K, €40K, €76K, €17K, €125K, ...

and I would like to convert them to:

$ Wage <int> 0.405, 0.195, 0.205, 0.240, 0.175, 0.025, 0.205, 0.057, 0.140, 0.135, 0.015, 0.045, 0.040, 0.076, 0.017, 0.125, ...

1 answer:

Answer 0 (score: 1)

We can use parse_number from readr to extract the numbers and then divide by 1000.

library(readr)
parse_number(as.character(df1$Wage))/1000
#[1] 0.405 0.195 0.205 0.240 0.175 0.025 0.205 0.057 0.140 
#[10] 0.135 0.015 0.045 0.040 0.076 0.017 0.125

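parse_number drops the non-numeric characters (the "€" prefix and the "K" suffix) and returns the numeric part, which is then divided by 1000. The as.character() call is needed because Wage is a factor.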
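If adding a dependency on readr is not desirable, roughly the same result can be obtained in base R. The following is only a sketch: the vector `wage` is a hypothetical sample in the same format as df1$Wage, and the regular expression assumes the values contain nothing but a currency symbol, digits, an optional decimal point and a one-letter suffix.

```r
# Hypothetical sample input in the same format as df1$Wage
wage <- factor(c("€405K", "€195K", "€25K"))

# Drop every character that is not a digit or a decimal point,
# convert the remaining text to numeric, then divide by 1000
wage_num <- as.numeric(gsub("[^0-9.]", "", as.character(wage))) / 1000
wage_num
```

Unlike the functions in the question, this replaces the unwanted characters with an empty string rather than a space and uses as.numeric() instead of as.integer(), so the decimal part is not truncated.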


The same can be done within a tidyverse pipeline:

library(dplyr)
library(readr)  # provides parse_number()
df1 %>%
   mutate(Wage = parse_number(as.character(Wage))/1000)
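The as.character() step matters because of how factors are stored. A minimal illustration, using a hypothetical factor `f` whose labels are numbers: converting the factor directly returns the internal level codes, not the values shown in the labels.

```r
# Hypothetical factor whose labels look like numbers
f <- factor(c("405", "195", "25"))

as.integer(f)                # 3 1 2       (level codes; levels sort as "195", "25", "405")
as.integer(as.character(f))  # 405 195 25  (the values in the labels)
```

This is why the title's "factor to integer" conversion should always go through character first.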

If some values end in "M" in addition to "K", we can use gsubfn:

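library(gsubfn)
# Turn each value into an arithmetic expression and evaluate it.
# Results are in millions: K values are divided by 1e3, M values are kept as is.
unname(sapply(gsubfn("[A-Z]", list(K = '/1e3', M = '*1'), 
       sub("€", "", df2$Wage)), function(x) eval(parse(text = x))))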
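The mixed K/M case can also be handled without eval(parse()) by looking up a multiplier for each suffix. This is a sketch under stated assumptions: `wage` is a hypothetical sample, the names `multiplier`, `suffix`, `value` and `wage_millions` are illustrative, and the result is assumed to be wanted in millions (so "K" scales by 1/1000 and "M" is left unchanged).

```r
# Hypothetical sample input mixing K and M suffixes
wage <- c("€405K", "€15M", "€57K")

# Multiplier per suffix, with the result expressed in millions (assumed unit)
multiplier <- c(K = 1e-3, M = 1)

suffix <- sub(".*([KM])$", "\\1", wage)          # trailing letter of each value
value  <- as.numeric(gsub("[^0-9.]", "", wage))  # numeric part of each value

wage_millions <- unname(value * multiplier[suffix])
wage_millions
```

A lookup vector keeps the scaling rules in one place and avoids evaluating strings as code, which is generally safer when the input is not fully trusted.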

Data

df1 <- data.frame(Wage = c("€405K", "€195K", "€205K", "€240K", "€175K",
  "€25K", "€205K", "€57K",  "€140K", "€135K", "€15K", "€45K",
     "€40K", "€76K", "€17K", "€125K"))

df2 <- data.frame(Wage = c("€405K", "€195K", "€205K", "€240K", "€175K",
  "€25K", "€205K", "€57K",  "€140K", "€135K", "€15M", "€45K",
     "€40K", "€76K", "€17M", "€125K"))