Answer 0 (score: 1)
You may be looking for something like this:
df <- data.frame(group = c("a", "a", "b"),
                 start = as.Date(c("2018-01-01", "2018-09-01", "2018-02-01")),
                 end = as.Date(c("2018-02-15", "2018-12-31", "2018-03-30")))
library(tidyverse)
library(lubridate)
df %>%
  group_by(id = row_number()) %>%              # for each row
  mutate(seq = list(seq(start, end, "day")),   # create a sequence of dates with a 1-day step
         month = map(seq, month)) %>%          # get the month of each date in the sequence
  unnest() %>%                                 # unnest data (tidyr >= 1.0 prefers unnest(cols = c(seq, month)))
  group_by(group, id, month) %>%               # for each group, row and month
  summarise(start = min(seq),                  # get minimum date as start
            end = max(seq)) %>%                # get maximum date as end
  ungroup() %>%                                # ungroup
  select(-id, -month)                          # remove unnecessary columns
# # A tibble: 8 x 3
# group start end
# <fct> <date> <date>
# 1 a 2018-01-01 2018-01-31
# 2 a 2018-02-01 2018-02-15
# 3 a 2018-09-01 2018-09-30
# 4 a 2018-10-01 2018-10-31
# 5 a 2018-11-01 2018-11-30
# 6 a 2018-12-01 2018-12-31
# 7 b 2018-02-01 2018-02-28
# 8 b 2018-03-01 2018-03-30
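For readers working in pandas, the same steps (expand each row into daily dates, tag each date with its month, then take the minimum and maximum per group, row and month) can be sketched as follows. This is only an illustrative translation of the R pipeline above, not part of the original answer; the names `daily`, `row_id` and `result` are made up for the example. Like the R code, it groups by month number only, so it assumes no single range spans more than twelve months.

```python
import pandas as pd

df = pd.DataFrame({
    "group": ["a", "a", "b"],
    "start": pd.to_datetime(["2018-01-01", "2018-09-01", "2018-02-01"]),
    "end": pd.to_datetime(["2018-02-15", "2018-12-31", "2018-03-30"]),
})

# Expand every row into one record per day, remembering which row it came from.
pieces = []
for row_id, r in enumerate(df.itertuples(index=False)):
    days = pd.date_range(r.start, r.end, freq="D")
    pieces.append(pd.DataFrame({"group": r.group, "id": row_id, "day": days}))
daily = pd.concat(pieces, ignore_index=True)

# Tag each day with its month, then keep the first and last day per group, row and month.
daily["month"] = daily["day"].dt.month
result = (
    daily.groupby(["group", "id", "month"])["day"]
    .agg(start="min", end="max")
    .reset_index()
    .drop(columns=["id", "month"])
)
print(result)
```

Building one record per day is simple but memory-hungry for long ranges; it is shown here only because it mirrors the R approach step by step.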
lookup
Answer 1 (score: 0)
If you know that every year in the index also appears in your columns, you can try the following:
import pandas as pd
import numpy as np
df = pd.DataFrame(
    {2008: 5*[1], 2009: 5*[2], 2010: 5*[3], 2011: 5*[4], 2012: 5*[5]},
    index=[2008, 2011, 2010, 2009, 2012]
)
np.diag(df.loc[df.index, df.index])
Result:
array([1, 4, 3, 2, 5])
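Note that `df.loc[df.index, df.index]` builds a full n × n frame just to read its diagonal. A minimal sketch of an alternative, assuming the column labels are unique, is to translate each index label into a column position and use NumPy integer indexing. The variable names `col_pos`, `row_pos` and `values` are illustrative only. (Older pandas versions offered `DataFrame.lookup` for this kind of label-pair lookup, but it was deprecated in pandas 1.2 and removed in 2.0.)

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {2008: 5*[1], 2009: 5*[2], 2010: 5*[3], 2011: 5*[4], 2012: 5*[5]},
    index=[2008, 2011, 2010, 2009, 2012],
)

# Column position for each index label; get_indexer returns -1 for labels
# that are not present in the columns.
col_pos = df.columns.get_indexer(df.index)
if (col_pos == -1).any():
    raise KeyError("some index labels do not appear in the columns")

# Pair row i with the column whose label equals that row's index label.
row_pos = np.arange(len(df))
values = df.to_numpy()[row_pos, col_pos]
print(values)  # [1 4 3 2 5]
```

This reads only n cells instead of materialising n × n, which matters mainly for large frames; for small ones the `np.diag` version is just as good.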