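I have a factor with 49 levels in R that I am trying to convert to numeric using as.numeric.
I want the North designation converted to "+" and the South designation to "-", so that the data become signed numeric values.
I'm not sure how to get beyond: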
Mcity$lat <- as.numeric(Mcity$Latitude)
structure(c(40L, 40L, 40L, 40L), .Label =
c("0.80N", "0.80S", "10.45N", "12.05N", "12.05S", "13.66N", "13.66S",
"15.27N", "15.27S", "16.87N", "18.48N", "18.48S", "2.41N", "20.09N",
"20.09S", "21.70N", "23.31N", "23.31S", "24.92N", "26.52N", "28.13N",
"29.74N", "29.74S", "31.35N", "32.95N", "32.95S", "34.56N", "34.56S",
"36.17N", "37.78N", "37.78S", "39.38N", "4.02N", "4.02S", "40.99N", "42.59N",
"44.20N", "45.81N", "49.03N", "5.63N", "5.63S", "50.63N", "52.24N", "55.45N",
"60.27N", "7.23N", "7.23S", "8.84N", "8.84S"), class = "factor")
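For context on why the plain as.numeric call is not enough: applied to a factor, as.numeric returns the internal integer level codes (such as the 40L values in the structure above) rather than the numbers spelled out in the labels. A minimal sketch, using a small made-up factor (lat_demo is a hypothetical name, not part of Mcity):

```r
# Hypothetical three-value factor standing in for Mcity$Latitude
lat_demo <- factor(c("37.78N", "0.80S", "37.78N"))

# as.numeric() on a factor yields the level codes, not the latitudes
codes <- as.numeric(lat_demo)

# Converting to character first exposes the labels; they still carry the
# N/S suffix, so as.numeric() cannot parse them and returns NA
labels <- as.character(lat_demo)
parsed <- suppressWarnings(as.numeric(labels))
```

So the suffix has to be stripped and turned into a sign before the numeric conversion can succeed.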
Answer 0 (score: 2)
This should work:
Mcity$lat <- (1 - 2 * grepl("S", Mcity$Latitude)) * as.numeric(gsub("N|S", "", Mcity$Latitude))
It flips the sign of the numeric part whenever an S is found.
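To see why the one-liner works, it may help to split it into its two factors. This is only an illustrative sketch; lat_demo, sign_mult, and magnitude are hypothetical names and the sample values are made up rather than taken from Mcity:

```r
# Hypothetical sample values, not taken from Mcity
lat_demo <- factor(c("12.05N", "12.05S", "0.80S"))

# grepl() returns TRUE/FALSE; in arithmetic these act as 1/0,
# so 1 - 2 * TRUE is -1 and 1 - 2 * FALSE is 1
sign_mult <- 1 - 2 * grepl("S", lat_demo)

# gsub() drops the trailing letter, leaving text that as.numeric() can parse
magnitude <- as.numeric(gsub("N|S", "", lat_demo))

# Multiplying the two gives the signed latitude
signed_lat <- sign_mult * magnitude
```

Both grepl and gsub coerce the factor to character internally, which is why no explicit as.character call is needed here.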
Answer 1 (score: 1)
You can use stringr to strip the last character and then dplyr to recombine the pieces. I used case_when to provide extra error handling, but ifelse would be sufficient.
library(dplyr)
library(stringr)
fct_list <- factor(
c(
"0.80N", "0.80S", "10.45N", "12.05N", "12.05S", "13.66N", "13.66S",
"15.27N", "15.27S", "16.87N", "18.48N", "18.48S", "2.41N", "20.09N",
"20.09S", "21.70N", "23.31N", "23.31S", "24.92N", "26.52N", "28.13N",
"29.74N", "29.74S", "31.35N", "32.95N", "32.95S", "34.56N", "34.56S",
"36.17N", "37.78N", "37.78S", "39.38N", "4.02N", "4.02S", "40.99N",
"42.59N", "44.20N", "45.81N", "49.03N", "5.63N", "5.63S", "50.63N",
"52.24N", "55.45N", "60.27N", "7.23N", "7.23S"
)
)
# note that factors are often no fun, so I've converted to character here
string <- as.character(fct_list)
case_when(
str_sub(string, -1, -1) == "N" ~ as.numeric(str_sub(string, 1, nchar(string) - 1)),
str_sub(string, -1, -1) == "S" ~ -as.numeric(str_sub(string, 1, nchar(string) - 1)),
TRUE ~ NA_real_
)
# [1] 0.80 -0.80 10.45 12.05 -12.05 13.66 -13.66 15.27
# [9] -15.27 16.87 18.48 -18.48 2.41 20.09 -20.09 21.70
# [17] 23.31 -23.31 24.92 26.52 28.13 29.74 -29.74 31.35
# [25] 32.95 -32.95 34.56 -34.56 36.17 37.78 -37.78 39.38
# [33] 4.02 -4.02 40.99 42.59 44.20 45.81 49.03 5.63
# [41] -5.63 50.63 52.24 55.45 60.27 7.23 -7.23
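This is more verbose than the regex solution from BenoitLondon, but in exploratory work I tend to favor clarity over brevity.
Answer 2 (score: 0)
Another option, using ifelse, could look like this: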
lat <- c("0.80N", "0.80S", "10.45N", "12.05S", "12.05S")
lat <- as.character(lat)
## use of substr function inside an ifelse function
lat2 <- ifelse(substr(lat,nchar(lat),nchar(lat)) == 'N',
as.numeric(substr(lat,1,(nchar(lat)-1))),
-as.numeric(substr(lat,1,(nchar(lat)-1))))
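Note that the ifelse call above treats every value that does not end in N as southern. If unexpected suffixes are possible, a nested ifelse can return NA instead, similar in spirit to the case_when fallback in the previous answer. A sketch under that assumption; lat_demo and its values, including the deliberately malformed "7.23E", are made up for illustration:

```r
# Hypothetical input with one deliberately malformed entry ("7.23E")
lat_demo <- c("0.80N", "12.05S", "7.23E")

# Split each string into its last character and everything before it
suffix <- substr(lat_demo, nchar(lat_demo), nchar(lat_demo))
magnitude <- suppressWarnings(
  as.numeric(substr(lat_demo, 1, nchar(lat_demo) - 1))
)

# N keeps the sign, S negates it, anything else becomes NA
lat_signed <- ifelse(suffix == "N", magnitude,
                     ifelse(suffix == "S", -magnitude, NA_real_))
```

This keeps everything in base R, at the cost of a second level of nesting.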