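I have a variable containing German-formatted dates and want to convert it to a Date variable so that I can later filter by year and quarter.
Like this: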
        Date    newDate quarter
1 21. Mrz 10       <NA>    <NA>
2 21. Jan 10 2010-01-21 2010 Q1
3 30. Mrz 10       <NA>    <NA>
4 21. Mrz 10       <NA>    <NA>
5 21. Jan 10 2010-01-21 2010 Q1
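Unfortunately, R does not recognize the German abbreviation for March, "Mrz".
I have already tried changing the locale to German, but that did not help.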
Sys.setlocale(category = "LC_TIME", locale="de_DE.UTF-8")
[1] "de_DE.UTF-8"
alldata_LOR_BZ$newErstesAngebot = as.Date(as.character(alldata_LOR_BZ$newErstesAngebot), "%d. %b %y")
library(zoo)
Dateproblem$quarter <- as.yearqtr(Dateproblem$newDate)
sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.4
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/de_DE.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] reprex_0.2.0 tidyr_0.8.1 zoo_1.8-3 foreign_0.8-71 car_3.0-0
[6] carData_3.0-1 gplots_3.0.1 plm_1.6-6 Formula_1.2-3 dplyr_0.7.6
[11] ggplot2_3.0.0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.18 bdsmatrix_1.3-3 lattice_0.20-35 gtools_3.8.1
[5] assertthat_0.2.0 rprojroot_1.3-2 digest_0.6.15 lmtest_0.9-36
[9] R6_2.2.2 cellranger_1.1.0 plyr_1.8.4 backports_1.1.2
[13] evaluate_0.11 pillar_1.3.0 miscTools_0.6-22 rlang_0.2.1
[17] lazyeval_0.2.1 curl_3.2 readxl_1.1.0 data.table_1.11.4
[21] gdata_2.18.0 whisker_0.3-2 callr_2.0.4 rmarkdown_1.10
[25] stringr_1.3.1 munsell_0.5.0 compiler_3.5.1 pkgconfig_2.0.1
[29] clipr_0.4.1 maxLik_1.3-4 htmltools_0.3.6 tidyselect_0.2.4
[33] tibble_1.4.2 rio_0.5.10 crayon_1.3.4 withr_2.1.2
[37] MASS_7.3-50 bitops_1.0-6 grid_3.5.1 nlme_3.1-137
[41] gtable_0.2.0 magrittr_1.5 scales_0.5.0 KernSmooth_2.23-15
[45] zip_1.0.0 stringi_1.2.4 bindrcpp_0.2.2 sandwich_2.4-0
[49] openxlsx_4.1.0 tools_3.5.1 forcats_0.3.0 glue_1.3.0
[53] purrr_0.2.5 hms_0.4.2 processx_3.1.0 abind_1.4-5
[57] yaml_2.2.0 colorspace_1.3-2 caTools_1.17.1.1 knitr_1.20
[61] bindr_0.1.1 haven_1.1.2
I have now noticed that the locale change seems to have no effect... Is Sys.setlocale(category = "LC_TIME", locale = "de_DE.UTF-8") not the right command?
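One way to see what "%b" expects under the active locale is to format one date per month and look at the abbreviations R produces. This is only a diagnostic sketch: the variable name abbr is made up for illustration, and the output depends on the platform and locale, so none is shown here.

```r
# Abbreviated month names as the current LC_TIME locale renders them;
# these are the strings that "%b" will match when parsing.
abbr <- format(ISOdate(2010, 1:12, 1), "%b")
print(abbr)

# TRUE only if the locale spells March exactly as it appears in the data
print("Mrz" %in% abbr)
```

If the last line prints FALSE, the locale may well be set correctly and simply use a different spelling for March than the data does.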
Answer 0 (score: 1)
It is not hard to write your own vectorized parser based on the official DIN 1355 three-letter German month abbreviations.
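# Three-letter German month abbreviations (DIN 1355)
months <- c(
  "Jan", "Feb", "Mrz", "Apr",
  "Mai", "Jun", "Jul", "Aug",
  "Sep", "Okt", "Nov", "Dez")
# Custom function to parse German dates of the form "DD. MMM YY"
parse.de.date <- function(x) {
  as.Date(
    vapply(as.character(x), function(s) {
      # Drop the dot after the day, then split into day, month, year
      dmy <- unlist(strsplit(gsub("\\.", "", s), "\\s+"))
      # Replace the month abbreviation with its number (1-12)
      paste(dmy[1], match(dmy[2], months), dmy[3], sep = "-")
    }, character(1), USE.NAMES = FALSE),
    format = "%d-%m-%y")
}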
library(dplyr)
df %>%
mutate(Date = parse.de.date(Date))
# Date newDate quarter
#1 2010-03-21 <NA> <NA>
#2 2010-01-21 2010-01-21 2010 Q1
#3 2010-03-30 <NA> <NA>
#4 2010-03-21 <NA> <NA>
#5 2010-01-21 2010-01-21 2010 Q1
Sample data:
df <- read.table(text =
"  Date newDate quarter
1 '21. Mrz 10' <NA> <NA>
2 '21. Jan 10' 2010-01-21 '2010 Q1'
3 '30. Mrz 10' <NA> <NA>
4 '21. Mrz 10' <NA> <NA>
5 '21. Jan 10' 2010-01-21 '2010 Q1'", header = TRUE)
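The same idea can also be written without a per-element loop, by pulling the three fields out with a regular expression. This is a sketch under the assumption that every value has the form "DD. MMM YY"; the names de_months and parse_de_date_regex are made up for illustration, and anything that does not match the pattern comes back as NA.

```r
# Three-letter German month abbreviations, in calendar order
de_months <- c("Jan", "Feb", "Mrz", "Apr", "Mai", "Jun",
               "Jul", "Aug", "Sep", "Okt", "Nov", "Dez")

# Hypothetical helper: parse "DD. MMM YY" strings into Date values
parse_de_date_regex <- function(x) {
  x <- as.character(x)
  pattern <- "^\\s*(\\d{1,2})\\.\\s+(\\S+)\\s+(\\d{2})\\s*$"
  day  <- sub(pattern, "\\1", x)
  mon  <- match(sub(pattern, "\\2", x), de_months)  # month number, or NA
  year <- sub(pattern, "\\3", x)
  # Non-matching input yields a string that fails to parse, hence NA
  as.Date(paste(day, mon, year, sep = "-"), format = "%d-%m-%y")
}

parsed <- parse_de_date_regex(c("21. Mrz 10", "21. Jan 10", "30. Mrz 10"))
print(parsed)
```

Because the month is matched against a fixed vector, the result does not depend on the LC_TIME setting.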
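Answer 1 (score: 0)
There is no need to write anything yourself. The readr package can do all the work for you; you only need to define the abbreviated month names: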
# example data:
dates <- c("21. Mrz 10",
"21. Jan 10",
"30. Mrz 10",
"21. Mrz 10",
"21. Jan 10")
# load library
library(readr)
# get the default German date names
my_format <- date_names_lang("de")
# change the abbreviated month names
my_format$mon_ab <- c("Jan", "Feb", "Mrz", "Apr", "Mai", "Jun", "Jul", "Aug", "Sep", "Okt", "Nov", "Dez")
# parse using your format
parse_date(dates, format = "%d. %b %y", locale = locale(date_names = my_format))
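Once the dates are parsed, the year-quarter label from the question can be built in base R as an alternative to zoo's as.yearqtr. The sketch below uses a small hand-written vector of already-parsed dates; the names parsed_dates, year_quarter and q1_2010 are illustrative only.

```r
# Already-parsed dates (stand-ins for the output of either parser)
parsed_dates <- as.Date(c("2010-03-21", "2010-01-21", "2010-07-04"))

# Build labels such as "2010 Q1" from the year and the quarter
year_quarter <- paste(format(parsed_dates, "%Y"), quarters(parsed_dates))
print(year_quarter)

# Keep only the dates that fall in the first quarter of 2010
q1_2010 <- parsed_dates[year_quarter == "2010 Q1"]
print(q1_2010)
```

The labels here are plain character strings, so they work for filtering and grouping; as.yearqtr is the better fit when the quarters need to be sorted or used in arithmetic.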