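Rearranging a weekday-only daily series into weekly rows with Monday to Friday columns

Time: 2020-02-16 15:43:05

Tags: r reshape

I have a daily time series that contains weekdays only (no weekends). I want to reorder it so that each row represents one week, with five columns holding that week's Monday, Tuesday, Wednesday, Thursday and Friday values.

I tried using cast (package: reshape), but I had trouble getting this layout.

Thanks for your help.

Example: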

Date        Day Value
06/01/2020  mon 15
07/01/2020  tue 16
08/01/2020  wed 17
09/01/2020  thu 18
10/01/2020  fri 19
13/01/2020  mon 20
14/01/2020  tue 21
15/01/2020  wed 22
16/01/2020  thu 23
17/01/2020  fri 24

To be reshaped into:

Start of week   mon tue wed thu fri
06/01/2020      15  16  17  18  19
13/01/2020      20  21  22  23  24

2 answers:

Answer 0: (score: 2)

Here is an example using the tidyverse packages. The key is to use pivot_wider from the tidyr package to convert the data to wide format. I also used the lubridate package to convert Date from character to Date class.

library(tidyverse)
library(lubridate)

dat2 <- dat %>%
  # Convert Day to factor for ordering
  mutate(Day = factor(Day, levels = c("mon", "tue", "wed", "thu", "fri"))) %>%
  # Create a grouping variable to show the week number
  group_by(Day) %>%
  mutate(Group = 1:n()) %>%
  ungroup() %>%
  # Change the Date based on Group
  group_by(Group) %>%
  mutate(Date = min(dmy(Date))) %>%
  # Convert to wide format
  pivot_wider(names_from = Day, values_from = Value) %>%
  # Remove Group
  ungroup() %>%
  select(-Group)

dat2
# # A tibble: 2 x 6
#   Date         mon   tue   wed   thu   fri
#   <date>     <int> <int> <int> <int> <int>
# 1 2020-01-06    15    16    17    18    19
# 2 2020-01-13    20    21    22    23    24
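The Group counter in the code above numbers each weekday independently, so it relies on every week containing all five weekdays. If a day is missing (a holiday, for instance), later values of that weekday would slide into the wrong week. A minimal sketch of one way around this, not part of the original answer, is to derive the week start from the calendar with lubridate's floor_date and week_start = 1 (Monday). The data frame dat_gap and the column name Week are hypothetical and only serve the illustration.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Hypothetical data: same layout as the question, but Wednesday 08/01/2020
# is left out to mimic a holiday in the first week
dat_gap <- data.frame(
  Date  = c("06/01/2020", "07/01/2020", "09/01/2020", "10/01/2020",
            "13/01/2020", "14/01/2020", "15/01/2020", "16/01/2020",
            "17/01/2020"),
  Day   = c("mon", "tue", "thu", "fri",
            "mon", "tue", "wed", "thu", "fri"),
  Value = c(15L, 16L, 18L, 19L, 20L, 21L, 22L, 23L, 24L),
  stringsAsFactors = FALSE
)

wide_gap <- dat_gap %>%
  # Monday of the calendar week that each date falls in
  mutate(Week = floor_date(dmy(Date), unit = "week", week_start = 1)) %>%
  # Drop the daily date so that Week is the only identifier column
  select(-Date) %>%
  # One row per week, one column per weekday
  pivot_wider(names_from = Day, values_from = Value) %>%
  # Put the weekday columns in calendar order
  select(Week, all_of(c("mon", "tue", "wed", "thu", "fri")))

wide_gap
```

With this input the first row has NA in the wed column, and the value 22 stays in the week starting 2020-01-13 instead of moving up into the first week.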

Data

# Create example data frame
dat <- read.table(text = "Date Day Value
'06/01/2020' mon 15
'07/01/2020' tue 16
'08/01/2020' wed 17
'09/01/2020' thu 18
'10/01/2020' fri 19
'13/01/2020' mon 20
'14/01/2020' tue 21
'15/01/2020' wed 22
'16/01/2020' thu 23
'17/01/2020' fri 24",
                  header = TRUE, stringsAsFactors = FALSE)

Answer 1: (score: 2)

Another option with the data.table package:

library(data.table)

# convert to a 'data.table'
# set the 'Date' and 'Day' columns in the right format
setDT(mydf)[, `:=` (Date = as.Date(Date, format = "%d/%m/%Y"),
                    Day = factor(Day, levels = c("mon","tue","wed","thu","fri")))]

# create a 'start_of_week' column
# transform from long to wide format
res <- mydf[, start_of_week := Date[1], by = cumsum(Day == "mon")
            ][, dcast(.SD, start_of_week ~ Day, value.var = "Value")]

which gives:

> res
   start_of_week mon tue wed thu fri
1:    2020-01-06  15  16  17  18  19
2:    2020-01-13  20  21  22  23  24
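The grouping in the code above hinges on cumsum(Day == "mon"): the comparison is TRUE on every Monday row, so the running sum goes up by one each time a new week starts. A small base R illustration (the vector week_id is a name chosen here for the example), assuming as in the question that every week begins with a "mon" row:

```r
# Day column as in the question: two complete weeks
Day <- c("mon", "tue", "wed", "thu", "fri",
         "mon", "tue", "wed", "thu", "fri")

# TRUE on each "mon" row; the running sum increases by one at every Monday
week_id <- cumsum(Day == "mon")
week_id
# [1] 1 1 1 1 1 2 2 2 2 2
```

If a week had no Monday row, its days would be counted with the previous week, so this shortcut is only safe when Mondays are always present.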

Data used:

mydf <- read.table(text="Date        Day Value
06/01/2020  mon 15
07/01/2020  tue 16
08/01/2020  wed 17
09/01/2020  thu 18
10/01/2020  fri 19
13/01/2020  mon 20
14/01/2020  tue 21
15/01/2020  wed 22
16/01/2020  thu 23
17/01/2020  fri 24", header=TRUE, stringsAsFactors=FALSE)