我想将每个月的最后两列复制到下个月的开始。我按如下所述进行操作,但是数据包含NA,当我将其更改为字符时,程序将崩溃。如何复制列以保持其类型?
我的代码:
library(readxl)
library(tibble)
df<- read_excel("C:/Users/Rezerwa/Documents/Database.xlsx")
df=add_column(df, Feb1 = as.character(do.call(paste0, df["January...4"])), .after = "January...5")
df=add_column(df, Feb2 = as.numeric(do.call(paste0, df["January...5"])), .after = "Feb1")
我的数据:
df
# A tibble: 10 x 13
Product January...2 January...3 January...4 January...5 February...6 February...7 February...8 February...9 March...10 March...11 March...12 March...13
<chr> <lgl> <lgl> <chr> <dbl> <chr> <dbl> <chr> <dbl> <chr> <dbl> <chr> <dbl>
1 a NA NA 754.00 4 754.00 4 754.00 4 754.00 4 754.00 4
2 b NA NA 706.00 3 706.00 3 706.00 3 706.00 3 706.00 3
3 c NA NA 517.00 3 517.00 3 517.00 3 517.00 3 517.00 3
4 d NA NA 1466.00 9 1466.00 9 1466.00 9 1466.00 9 1466.00 9
5 e NA NA 543.00 8 543.00 8 543.00 8 543.00 8 543.00 8
6 f NA NA NA NA NA NA NA NA NA NA NA NA
7 g NA NA NA NA NA NA NA NA NA NA NA NA
8 h NA NA NA NA NA NA NA NA NA NA NA NA
9 i NA NA 1466.00 8 NA NA NA NA NA NA NA NA
10 j NA NA NA NA 543.00 3 NA NA NA NA NA NA
我的错误:
> df=add_column(df, Feb1 = as.character(do.call(paste0, df["January...4"])), .after = "January...5")
> df=add_column(df, Feb2 = as.numeric(do.call(paste0, df["January...5"])), .after = "Feb1")
Warning message:
In eval_tidy(xs[[i]], unique_output) : NAs introduced by coercion
答案 0 :(得分:1)
使用基数R,我们可以根据列名的前缀对列进行拆分,从每个组中选择最后两列,并从cbind
到原始df
。
df1 <- cbind(df, do.call(cbind, lapply(split.default(df[-1],
sub("\\..*", "", names(df)[-1])), function(x) {n <- ncol(x);x[, c(n-1, n)]})))
要按顺序获取数据,我们可以
cbind(df1[1], df1[-1][order(match(sub("\\..*", "", names(df1)[-1]), month.name))])
数据
df <- structure(list(Product = structure(1:10, .Label = c("a", "b",
"c", "d", "e", "f", "g", "h", "i", "j"), class = "factor"), January...2 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA), January...3 = c(NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA), January...4 = c(754, 706, 517,
1466, 543, NA, NA, NA, 1466, NA), January...5 = c(4L, 3L, 3L,
9L, 8L, NA, NA, NA, 8L, NA), February...6 = c(754, 706, 517,
1466, 543, NA, NA, NA, NA, 543), February...7 = c(4L, 3L, 3L,
9L, 8L, NA, NA, NA, NA, 3L), February...8 = c(754, 706, 517,
1466, 543, NA, NA, NA, NA, NA), February...9 = c(4L, 3L, 3L,
9L, 8L, NA, NA, NA, NA, NA), March...10 = c(754, 706, 517, 1466,
543, NA, NA, NA, NA, NA), March...11 = c(4L, 3L, 3L, 9L, 8L,
NA, NA, NA, NA, NA), March...12 = c(754, 706, 517, 1466, 543,
NA, NA, NA, NA, NA), March...13 = c(4L, 3L, 3L, 9L, 8L, NA, NA,
NA, NA, NA)), class = "data.frame", row.names = c("1", "2", "3",
"4", "5", "6", "7", "8", "9", "10"))