I have the following dummy data frame:
test_df <- structure(list(id = 1:10, dates = c("2018-07-02, 2018-06-28",
"2018-08-22", "2018-08-06, 2018-07-31", "2018-03-08", "2018-02-22, 2018-02-19",
"2018-07-04, 2018-07-06", "2018-06-26, 2018-06-22", "2018-01-18, 2018-01-24",
"2018-06-05, 2018-06-14", "2018-01-18")), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -10L))
I want to convert all entries in the "dates" column to dates, then keep the most recent one and drop all other dates in that cell.
I tried the following:
library(dplyr)
library(reprex)
library(purrr)
library(lubridate)
test_df %>%
mutate(dates = dates %>%
str_extract_all("[0-9]+-[0-9]+-[0-9]+") %>%
map(ymd) %>%
map_lgl(~ any(max(.))))
But somehow this converts all the entries in each cell to numbers instead of proper dates.
What I would like to end up with:
id dates
1 2018-07-02
2 2018-08-22
3 2018-08-06
4 2018-03-08
5 2018-02-22
6 2018-07-06
7 2018-06-26
8 2018-01-24
9 2018-06-14
10 2018-01-18
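Answer 0 (score: 2)
Read the fields of each cell with scan, take the maximum, and convert it to Date class. Taking the maximum of the strings works because dates in YYYY-MM-DD form sort as text in chronological order.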
library(dplyr)
scan_max <- function(x) {
max(scan(text = x, what = "", sep = ",", quiet = TRUE, strip.white = TRUE))
}
test_df %>%
mutate(dates = as.Date(sapply(dates, scan_max)))
giving:
# A tibble: 10 x 2
id dates
<int> <date>
1 1 2018-07-02
2 2 2018-08-22
3 3 2018-08-06
4 4 2018-03-08
5 5 2018-02-22
6 6 2018-07-06
7 7 2018-06-26
8 8 2018-01-24
9 9 2018-06-14
10 10 2018-01-18
It could also be written like this:
scan_max <- . %>%
scan(text = ., what = "", sep = ",", quiet = TRUE, strip.white = TRUE) %>%
max
test_df %>%
mutate(dates = dates %>% sapply(scan_max) %>% as.Date)
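As a small check of the idea behind scan_max, the sketch below compares the string maximum with the date maximum for one cell. The vector x is made up for illustration from row 6 of the question's data; it is not part of the original answer.

```r
# Assumption: dates are zero-padded ISO strings (YYYY-MM-DD), as in the question.
# For such strings, text order and chronological order agree.
x <- c("2018-07-04", "2018-07-06")

latest_as_text <- max(x)            # maximum taken on the character strings
latest_as_date <- max(as.Date(x))   # maximum taken after converting to Date

# Both routes should pick the same day
same_day <- as.Date(latest_as_text) == latest_as_date
print(same_day)
```

If the dates were stored in another format, such as day-month-year, the string maximum would no longer be reliable and the conversion to Date would have to come first.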
Answer 1 (score: 1)
You can try:
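Answer 2 (score: 1)
I use three mutate calls: the first splits each string on ", ", the second converts the pieces to Date, and the third keeps the latest date in each cell. The result is then unnested: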
df <- structure(list(id = 1:10, dates = c("2018-07-02, 2018-06-28",
"2018-08-22", "2018-08-06, 2018-07-31", "2018-03-08", "2018-02-22, 2018-02-19",
"2018-07-04, 2018-07-06", "2018-06-26, 2018-06-22", "2018-01-18, 2018-01-24",
"2018-06-05, 2018-06-14", "2018-01-18")), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -10L))
library(tidyr)
library(stringi)
library(dplyr)
df_new <- df %>%
mutate(dates = stri_split_fixed(dates, ", ")) %>%
mutate(dates = rapply(dates, as.Date, how = "list")) %>%
mutate(dates = lapply(dates, function(x) {
sort(x, decreasing = TRUE)[1]
})) %>%
unnest(dates)
> df_new
# A tibble: 10 x 2
id dates
<int> <date>
1 1 2018-07-02
2 2 2018-08-22
3 3 2018-08-06
4 4 2018-03-08
5 5 2018-02-22
6 6 2018-07-06
7 7 2018-06-26
8 8 2018-01-24
9 9 2018-06-14
10 10 2018-01-18
Another option uses map instead of the two apply calls:
library(tidyr)
library(stringi)
library(dplyr)
library(purrr)
df_new <- df %>%
mutate(dates = stri_split_fixed(dates, ", ")) %>%
mutate(dates = map(dates, function(x) {
x <- as.Date(x)
sort(x, decreasing = TRUE)[1]
})) %>%
unnest(dates)
df_new
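For comparison, here is a minimal base R sketch of the same split, convert, and take-the-latest idea without any packages. The three-row test_df is a shortened copy of the question's data used only for illustration, and the names parts and latest are hypothetical.

```r
# Shortened copy of the question's data (assumption: same "YYYY-MM-DD, YYYY-MM-DD" layout)
test_df <- data.frame(
  id = 1:3,
  dates = c("2018-07-02, 2018-06-28", "2018-08-22", "2018-07-04, 2018-07-06"),
  stringsAsFactors = FALSE
)

# Split each cell on a comma followed by optional whitespace
parts <- strsplit(test_df$dates, ",\\s*")

# For each cell: convert to Date, take the latest, and return it as text
# (vapply needs a fixed return type, so the Date is formatted back to character)
latest <- vapply(parts, function(x) format(max(as.Date(x))), character(1))

# Convert the collected strings to a Date column
test_df$dates <- as.Date(latest)
print(test_df)
```

Going through character inside vapply is a design choice to keep the result a plain vector; the Date class is restored once at the end.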