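I am analyzing some data that includes a wage variable. Its values contain the symbol "€" and a trailing "M" or "K".
I tried to solve this with the gsub() function, but my code does not work: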
Integer_converter <- function(strWage) {
Factor_Wage = gsub("€", " ", strWage)
}
Factor_converter_1 <- function(strWage) {
Integer_Wage = gsub("M", " ", strWage)
}
Factor_converter_2 <- function(strWage) {
Integer_wage = as.integer(as.integer(gsub("K", "", strWage)) / 100)
}
The actual values look like this:
$ Wage <fct> €405K, €195K, €205K, €240K, €175K, €25K, €205K, €57K, €140K, €135K, €15K, €45K, €40K, €76K, €17K, €125K, ...
and I want to convert them to
$ Wage <int> 0.405, 0.195, 0.205, 0.240, 0.175, 0.025, 0.205, 0.057, 0.140, 0.135, 0.015, 0.045, 0.040, 0.076, 0.017, 0.125, …
Answer 0 (score: 1)
We can use parse_number from readr to extract the numeric part and then divide by 1000.
library(readr)
parse_number(as.character(df1$Wage))/1000
#[1] 0.405 0.195 0.205 0.240 0.175 0.025 0.205 0.057 0.140
#[10] 0.135 0.015 0.045 0.040 0.076 0.017 0.125
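For comparison, roughly the same result can be reached in base R with gsub(), which is what the question was attempting. This is only a sketch: it assumes every value ends in "K", and the vector name wage_chr is made up for illustration.

```r
# Hypothetical example vector; with the question's data this would be
# as.character(df1$Wage)
wage_chr <- c("€405K", "€25K", "€57K")

# Drop every character that is not a digit or a decimal point,
# convert what is left to numeric, then rescale by 1000
wage_num <- as.numeric(gsub("[^0-9.]", "", wage_chr)) / 1000
wage_num
```

The conversion goes through as.numeric() rather than as.integer(), since the rescaled values are fractional.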
parse_number can also be used inside a tidyverse chain:
library(dplyr)
df1 %>%
mutate(Wage = parse_number(as.character(Wage))/1000)
If some values end in "M" in addition to "K", we can use gsubfn:
library(gsubfn)
unname(sapply(gsubfn("[A-Z]", list(K = '/1e3', M = '/1e6'),
sub("€", "", df2$Wage)), function(x) eval(parse(text = x))))
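A base R sketch of the same idea that avoids eval(parse()) by looking up a multiplier for each suffix. It assumes the result should be expressed in millions of euros, matching the 0.405 for €405K that the question asks for, so "K" is scaled by 1/1000 and "M" by 1. The gsubfn call above instead divides "M" values by 1e6, so the two will differ for those entries. The names wage_chr, unit_scale and wage_millions are hypothetical.

```r
# Hypothetical example vector; with the data from this answer it would be
# as.character(df2$Wage)
wage_chr <- c("€405K", "€15M", "€76K", "€17M")

# Assumed target unit: millions of euros
unit_scale <- c(K = 1e-3, M = 1)

# Capture the trailing unit letter; values without K or M come back
# unchanged here and therefore end up as NA in the lookup below
suffix <- sub(".*([KM])$", "\\1", wage_chr)

# Keep only digits and the decimal point for the numeric part
amount <- as.numeric(gsub("[^0-9.]", "", wage_chr))

wage_millions <- unname(amount * unit_scale[suffix])
wage_millions
```

If a different target unit is wanted, only the values in unit_scale need to change.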
Data
df1 <- data.frame(Wage = c("€405K", "€195K", "€205K", "€240K", "€175K",
    "€25K", "€205K", "€57K", "€140K", "€135K", "€15K", "€45K",
    "€40K", "€76K", "€17K", "€125K"))
df2 <- data.frame(Wage = c("€405K", "€195K", "€205K", "€240K", "€175K",
    "€25K", "€205K", "€57K", "€140K", "€135K", "€15M", "€45K",
    "€40K", "€76K", "€17M", "€125K"))