Question

Is there a simple way to do this? I have a data frame with values and dates, and I want to add a column containing the most recent value for each weekday. I have added it manually as the column val_recent.

Answer 1

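Just group by weekday and take the first value. This works because the rows are already sorted from newest to oldest.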

library(dplyr)

# Rows are sorted newest first, so val[1] is the latest value per weekday
df %>%
  group_by(weekday) %>%
  mutate(val2 = val[1])
# A tibble: 20 x 5
# Groups:   weekday [7]
   date         val weekday   val_recent  val2
   <chr>      <int> <chr>          <int> <int>
 1 20/01/2018     4 Saturday           4     4
 2 19/01/2018     3 Friday             3     3
 3 18/01/2018     4 Thursday           4     4
 4 17/01/2018    10 Wednesday         10    10
 5 16/01/2018     2 Tuesday            2     2
 6 15/01/2018     6 Monday             6     6
 7 14/01/2018     1 Sunday             1     1
 8 13/01/2018     9 Saturday           4     4
 9 12/01/2018     9 Friday             3     3
10 11/01/2018     7 Thursday           4     4
11 10/01/2018     8 Wednesday         10    10
12 9/01/2018      6 Tuesday            2     2
13 8/01/2018     10 Monday             6     6
14 7/01/2018      6 Sunday             1     1
15 6/01/2018      3 Saturday           4     4
16 5/01/2018      3 Friday             3     3
17 4/01/2018     10 Thursday           4     4
18 3/01/2018      4 Wednesday         10    10
19 2/01/2018      2 Tuesday            2     2
20 1/01/2018      8 Monday             6     6

Answer 2

We can use first from dplyr

library(dplyr)
df %>%
   group_by(weekday) %>%
   mutate(val2 = first(val))

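If 'date' is not sorted, use which.max on the parsed dates to pick the most recent row in each group (which.min would return the oldest row, and the dates are dd/mm/yyyy strings, so they have to be converted with as.Date first)

df %>%
  group_by(weekday) %>%
  mutate(val2 = val[which.max(as.Date(date, format = "%d/%m/%Y"))])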

Or arrange by 'weekday' and descending 'date' first (this would be more expensive)

df %>%
  arrange(weekday, desc(as.Date(date, format = "%d/%m/%Y"))) %>%
  group_by(weekday) %>%
  mutate(val2 = first(val))
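
The order-independent variant can be tried on a small made-up data frame. This is only a minimal sketch: it assumes the dates are dd/mm/yyyy strings as in the printed tibble, and the sample rows and the helper column date_parsed are illustrative, not taken from the question.

```r
library(dplyr)

# Hypothetical sample data, deliberately NOT sorted by date
df <- data.frame(
  date    = c("13/01/2018", "20/01/2018", "12/01/2018", "19/01/2018"),
  val     = c(9L, 4L, 9L, 3L),
  weekday = c("Saturday", "Saturday", "Friday", "Friday"),
  stringsAsFactors = FALSE
)

res <- df %>%
  # Parse the strings so dates compare chronologically, not alphabetically
  mutate(date_parsed = as.Date(date, format = "%d/%m/%Y")) %>%
  group_by(weekday) %>%
  # Within each weekday, take the value from the row with the latest date
  mutate(val2 = val[which.max(date_parsed)]) %>%
  ungroup()

print(res)
```

Because the lookup goes through the parsed date rather than row position, the result does not depend on how the rows happen to be ordered.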

Most recent value by weekday as a new column

2 answers: