I have a column that looks like this:
fiscal_year_end
1 1231
2 1231
3 1231
4 1231
5 202
6 1231
7 1231
8 202
9 1231
10 927
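The values correspond to a month and a day, i.e. 12-31, 9-27 and 20-2.
I am trying to get the column into this format, but I can't seem to get it right.
Using the stringr package, I have tried str_replace_all(df$fiscal_year_end, "(?<=^\\d{2}|^\\d{4})", "-"), but it does not do what I want.
Where am I going wrong?
Data (added in an edit):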
structure(list(fiscal_year_end = c(1231L, 1231L, 1231L, 1231L,
202L, 1231L, 1231L, 202L, 1231L, 927L, 228L, 1231L, 1231L, 1231L,
1231L, 928L, 1231L, 1231L, 930L, 1231L, 1231L, 628L, 1231L, 1231L,
1228L, 930L, 1231L, 1231L, 1231L, 1231L, 927L, 630L, 1231L, 202L,
1231L, 1231L, 1231L, 1231L, 927L, 930L, 1231L, 1231L, 1231L,
1231L, 228L, 928L, 1231L, 1231L, 1231L, 1231L, 1231L, 1231L,
1231L, 1231L, 1231L, 1231L, 1228L, 1231L, 1231L, 1231L, 1231L,
131L, 1231L, 1231L, 1231L, 1231L, 1231L, 1231L, 930L, 1231L,
1231L, 1231L, 1231L, 1231L, 1231L, 1231L, 831L, 1231L, 102L,
1231L, 1231L, 1231L, 1130L, 1231L, 1228L, 1231L, 1231L, 1231L,
1231L, 1231L, 1231L, 1231L, 1231L, 1231L, 930L, 1031L, 1231L,
1231L, 1231L, 1231L, 1231L, 1231L, 203L, 1231L, 1231L, 1231L,
1231L, 1231L, 1229L, 1231L, 1231L, 1231L, 426L, 1231L, 1231L,
1231L, 1231L, 1231L, 1231L, 1231L, 1231L, 1231L, 202L, 1231L,
1231L, 1231L, 1231L, 1231L, 1231L, 1229L, 1231L, 1231L, 630L,
1231L, 1231L, 1209L, 1231L, 1231L, 1231L, 728L, 1231L, 1231L,
1231L, 1231L, 1231L, 1231L, 630L, 1231L, 1231L, 1231L, 1231L,
1231L, 1231L, 727L, 1231L, 201L, 1231L, 1231L, 1231L, 1231L,
1231L, 630L, 1231L, 1231L, 1231L, 1130L, 1231L, 1231L, 1231L,
1231L, 1231L, 1231L, 1231L, 930L, 930L, 1231L, 1231L, 331L, 1231L,
1231L, 1231L, 1231L, 1231L, 1231L, 1231L, 1031L, 1229L, 1231L,
1231L, 1231L, 201L, 1231L, 1231L, 1231L, 1231L, 1231L, 1231L,
831L, 630L, 831L)), row.names = c(NA, -200L), class = "data.frame")
Answer 0 (score: 2)
After formatting the values as 4-digit strings, we can use separate
library(dplyr)
library(tidyr)
df1 %>%
mutate(fiscal_year_end = sprintf("%04d", fiscal_year_end)) %>%
separate(fiscal_year_end, c("month", "day"), sep= 2)
Or use a negative index in separate, which counts the split position from the right
df1 %>%
separate(fiscal_year_end, c("month", "day"), sep= -2)
Or, using only base R, we use sub to create a delimiter (with a single capture group) and read the result into a two-column data.frame with read.csv
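The code for this base R variant did not survive in the text above, so the following is only a minimal sketch of what the description suggests, not the answerer's original code. The vector x, the comma delimiter, and the column names month and day are assumptions chosen for illustration.

```r
# Small illustrative sample (assumed; taken from the first values in the question)
x <- c(1231L, 202L, 927L)

# Insert a comma before the last two digits, using a single capture group
with_sep <- sub("(\\d{2})$", ",\\1", x)

# Parse the delimited strings into a two-column data.frame;
# colClasses = "character" keeps leading zeros such as "02"
parts <- read.csv(text = with_sep, header = FALSE,
                  col.names = c("month", "day"),
                  colClasses = "character")
print(parts)
```

Reading the columns as character is a deliberate choice here: without colClasses, read.csv would convert the day "02" to the integer 2.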
Answer 1 (score: 2)
Using base R, we can use sub with two capture groups, where the second group is the final two digits and the first group is everything before them.
sub("(.*)(\\d{2}$)", "\\1-\\2", df$fiscal_year_end)
#[1] "12-31" "12-31" "12-31" "12-31" "2-02" "12-31" "12-31" "2-02" "12-31"
# "9-27" "2-28" "12-31" .....
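If a zero-padded month is preferred (for example "02-02" rather than "2-02"), the padding step from Answer 0 can be combined with sub. This is a sketch under that assumption, not something stated in the answer above; the vector x is a made-up sample.

```r
# Small illustrative sample (assumed)
x <- c(1231L, 202L, 927L)

# Pad to four digits, then put a hyphen between the two pairs of digits
padded <- sprintf("%04d", x)
res <- sub("(\\d{2})(\\d{2})", "\\1-\\2", padded)
print(res)
```

Note that sprintf("%04d", ...) expects integer input, which matches the integer column in the question's data.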
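Answer 2 (score: 2)
Another, overly complicated way:
# Month: first two characters for 4-digit values, first character for 3-digit values
res1 <- ifelse(nchar(my_df$fiscal_year_end) %% 2 == 0, substring(my_df$fiscal_year_end, 1, 2),
               substring(my_df$fiscal_year_end, 1, 1))
# Day: the remaining two characters
res2 <- ifelse(nchar(my_df$fiscal_year_end) %% 2 == 0, substring(my_df$fiscal_year_end, 3, 4),
               substring(my_df$fiscal_year_end, 2, 3))
paste0(res1, "-", res2)
Result: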
[1] "12-31" "12-31" "12-31" "12-31" "2-02" "12-31" "12-31" "2-02" "12-31" "9-27"