Insert a blank row between groups and keep the original order

Time: 2019-01-22 15:19:28

Tags: r

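I am basically doing what is described in this question, but I am trying to keep the original order of the month column. One workaround is to add a leading zero to the single-digit values. I am looking for a way to do it without that.

Current code: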

library(dplyr)
df <- structure(list(parameters = c("temp", "temp", "temp", "temp", "temp", "temp", "temp", "temp", "temp", "temp", "temp", "temp", "temp", "temp", "temp"), month = c("2", "2", "2", "5", "5", "5", "8", "8", "8", "11", "11", "11", "annual", "annual", "annual")), class = "data.frame", row.names = c(NA, -15L))
do.call(bind_rows, by(df, df[ ,c("month", "parameters")], rbind, ""))

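The by function appears to coerce the indices you supply to factors, and converting month to a factor shows that the levels are built in this order: 11, 2, 5, 8, annual. If the column were numeric it would sort correctly, but because it contains annual it has to be character.

If I convert it to a factor first and set the level order myself, my code inserts NA: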
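
The level ordering described above can be reproduced in isolation. This is a minimal sketch using a shortened, made-up `month` vector rather than the full data frame: by default `factor()` sorts the unique values as strings, while passing `levels = unique(month)` keeps them in order of first appearance.

```r
month <- c("2", "2", "5", "8", "11", "annual")

# Default: levels are the sorted unique values, and string sorting puts "11" before "2".
sorted_levels <- levels(factor(month))
print(sorted_levels)

# Levels taken in order of first appearance keep the original sequence.
original_levels <- levels(factor(month, levels = unique(month)))
print(original_levels)
```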

df$month <- ordered(df$month, levels = c("2", "5", "8", "11", "annual"))
do.call(bind_rows, by(df, df[ ,c("month", "parameters")], rbind, ""))

Current output:

   parameters  month
1        temp     11
2        temp     11
3        temp     11
4                   
5        temp      2
6        temp      2
7        temp      2
8                   
9        temp      5
10       temp      5
11       temp      5
12                  
13       temp      8
14       temp      8
15       temp      8
16                  
17       temp annual
18       temp annual
19       temp annual
20                  

Desired output:

   parameters month
1        temp     2
2        temp     2
3        temp     2
4                  
5        temp     5
6        temp     5
7        temp     5
8                  
9        temp     8
10       temp     8
11       temp     8
12                 
13       temp    11
14       temp    11
15       temp    11
16                 
17       temp annual
18       temp annual
19       temp annual
20                  

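2 answers:

Answer 0 (score: 1)

The problem is that once the month column is changed to an ordered factor, "" is not specified as one of the levels. Any value that is not a level is treated as missing, which is why we get NA. This can be corrected by including "" as one of the levels: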
df$month <- ordered(df$month, levels = c("2", "5", "8", "11", "annual", ""))

Note: it is not clear where "" belongs in the order, so it is specified as the last level.

Answer 1 (score: 1)
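
The NA behaviour is easy to check on a small vector. The sketch below is illustrative only (the two-element input is made up): a value that is not among the levels becomes NA, and adding "" to the levels keeps it.

```r
lv <- c("2", "5", "8", "11", "annual")

# "" is not among the levels, so it is turned into NA.
without_blank <- factor(c("2", ""), levels = lv, ordered = TRUE)
print(is.na(without_blank))

# With "" added as the last level, the blank value survives.
with_blank <- factor(c("2", ""), levels = c(lv, ""), ordered = TRUE)
print(is.na(with_blank))
```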

There is another approach, which uses the data.table version of rbind() to add a blank row after each group:

library(data.table)
setDT(df)[, rbind(.SD, data.table(parameters = "")), by = month]
     month parameters
 1:      2       temp
 2:      2       temp
 3:      2       temp
 4:      2           
 5:      5       temp
 6:      5       temp
 7:      5       temp
 8:      5           
 9:      8       temp
10:      8       temp
11:      8       temp
12:      8           
13:     11       temp
14:     11       temp
15:     11       temp
16:     11           
17: annual       temp
18: annual       temp
19: annual       temp
20: annual

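The order of the groups is preserved. The grouping variable month appears first in each row. If needed, the same approach can be used to intersperse any number of blank rows: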

n_blank <- 2L
setDT(df)[, rbind(.SD, data.table(parameters = rep("", n_blank))), by = month]
     month parameters
 1:      2       temp
 2:      2       temp
 3:      2       temp
 4:      2           
 5:      2           
 6:      5       temp
 7:      5       temp
 8:      5       temp
 9:      5           
10:      5           
11:      8       temp
12:      8       temp
13:      8       temp
14:      8           
15:      8           
16:     11       temp
17:     11       temp
18:     11       temp
19:     11           
20:     11           
21: annual       temp
22: annual       temp
23: annual       temp
24: annual           
25: annual           
     month parameters
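
For comparison, the same idea can be sketched in base R without data.table. This is not part of the answer above; it is an alternative under the assumption that month stays a character column. The names `groups`, `blank`, and `out` are made up for illustration. Splitting on a factor whose levels follow first appearance keeps the group order, and here the blank rows leave month empty as well, as in the desired output of the question.

```r
df <- data.frame(
  parameters = rep("temp", 15),
  month = rep(c("2", "5", "8", "11", "annual"), each = 3),
  stringsAsFactors = FALSE
)
n_blank <- 1L

# Split on a factor whose levels follow first appearance, so groups keep their order.
groups <- split(df, factor(df$month, levels = unique(df$month)))

# n_blank all-empty rows, appended to the end of every group.
blank <- data.frame(
  parameters = rep("", n_blank),
  month = rep("", n_blank),
  stringsAsFactors = FALSE
)
out <- do.call(rbind, lapply(groups, function(g) rbind(g, blank)))
rownames(out) <- NULL  # replace the generated row names with 1..n
print(out)
```

Raising `n_blank` adds more empty rows per group, mirroring the `n_blank` variant shown for data.table.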