I'm working with a complicated matrix (complicated for me, anyway...).
It looks like this:
Invoice.1 Invoice.2 Invoice.3 mtime
1 21605000182 21605000183 NA 2017-01-16 19:51:33
2 21605000182 21605000183 NA 2017-01-16 19:51:33
3 21605000182 21605000183 NA 2017-01-16 19:51:33
4 21605000182 21605000183 NA 2017-01-16 19:51:33
5 21510000669 21602000125 21608000366 2017-01-20 13:28:36
6 21609000856 NA NA 2017-01-20 13:28:36
7 21606000405 21608000354 21608000356 2017-01-20 13:28:36
8 21610000133 NA NA 2017-01-20 13:28:36
9 21604000592 21605000604 21605000608 2017-01-20 13:28:36
10 21609001012 NA NA 2017-01-20 13:28:36
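I want to combine all the Invoice columns into a single column so that I can remove the NA values and the duplicates, while keeping each invoice matched to the date in the last column, i.e. the declared date.
Something like this: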
Invoice mtime
1 21605000182 2017-01-16 19:51:33
2 21605000182 2017-01-16 19:51:33
3 21605000182 2017-01-16 19:51:33
4 21605000182 2017-01-16 19:51:33
5 21510000669 2017-01-20 13:28:36
6 21609000856 2017-01-20 13:28:36
7 21606000405 2017-01-20 13:28:36
8 21610000133 2017-01-20 13:28:36
9 21604000592 2017-01-20 13:28:36
10 21609001012 2017-01-20 13:28:36
11 21605000183 2017-01-16 19:51:33
12 21605000183 2017-01-16 19:51:33
13 21605000183 2017-01-16 19:51:33
14 21605000183 2017-01-16 19:51:33
15 21602000125 2017-01-20 13:28:36
16 21608000354 2017-01-20 13:28:36
Answer 0 (score: 0)
An example using data.table (it should be faster than the alternatives):
library(data.table)
DT <- data.table(Invoice.1 = 1:3, Invoice.2 = c(1L,4L,5L), mtime = 11:13)
DT
Invoice.1 Invoice.2 mtime
1: 1 1 11
2: 2 4 12
3: 3 5 13
rez <- melt(DT, measure.vars = paste0("Invoice.", 1:2),
value.name = "Invoice")
rez[, variable := NULL]
rez
mtime Invoice
1: 11 1
2: 12 2
3: 13 3
4: 11 1
5: 12 4
6: 13 5
rez <- unique(rez)
rez
mtime Invoice
1: 11 1
2: 12 2
3: 13 3
4: 12 4
5: 13 5
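The toy data above contains no missing values. As a minimal sketch of the same idea on data shaped like the question's, `melt` can drop the NA entries itself through `na.rm = TRUE`. The sample rows are copied from the question; storing the invoice numbers and `mtime` as character strings is an assumption made here, not something the original answer specifies.

```r
library(data.table)

# Four sample rows taken from the question (assumed column types: character)
DT <- data.table(
  Invoice.1 = c("21605000182", "21605000182", "21510000669", "21609000856"),
  Invoice.2 = c("21605000183", "21605000183", "21602000125", NA),
  Invoice.3 = c(NA, NA, "21608000366", NA),
  mtime     = c("2017-01-16 19:51:33", "2017-01-16 19:51:33",
                "2017-01-20 13:28:36", "2017-01-20 13:28:36")
)

# Pick up every Invoice.* column instead of hard-coding their count
invoice_cols <- grep("^Invoice\\.", names(DT), value = TRUE)

# Wide -> long; na.rm = TRUE discards the rows whose invoice is NA
rez <- melt(DT, id.vars = "mtime", measure.vars = invoice_cols,
            value.name = "Invoice", na.rm = TRUE)
rez[, variable := NULL]   # the originating column name is not needed
rez <- unique(rez)        # drop repeated (mtime, Invoice) pairs
rez
```

If the repeated rows should be kept, as in the question's expected output, the `unique()` step can simply be left out.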
Answer 1 (score: 0)
The gather function from the tidyr package does what you need. gather converts a data.frame from wide format to long format.
library(tidyr)
library(readr)
# Create a temp file to store the example data
data_file <- tempfile()
cat(
"Invoice.1,Invoice.2,Invoice.3,mtime
21605000182,21605000183,NA,2017-01-16 19:51:33
21605000182,21605000183,NA,2017-01-16 19:51:33
21605000182,21605000183,NA,2017-01-16 19:51:33
21605000182,21605000183,NA,2017-01-16 19:51:33
21510000669,21602000125,21608000366,2017-01-20 13:28:36
21609000856,NA,NA,2017-01-20 13:28:36
21606000405,21608000354,21608000356,2017-01-20 13:28:36
21610000133,NA,NA,2017-01-20 13:28:36
21604000592,21605000604,21605000608,2017-01-20 13:28:36
21609001012,NA,NA,2017-01-20 13:28:36",
file = data_file,
append = FALSE)
# Read the data from the temp file into a data.frame called `invoices`
invoices <-
readr::read_csv(file = data_file, col_types = "cccT")
# View the data
invoices
# # A tibble: 10 x 4
# Invoice.1 Invoice.2 Invoice.3 mtime
# <chr> <chr> <chr> <dttm>
# 1 21605000182 21605000183 <NA> 2017-01-16 19:51:33
# 2 21605000182 21605000183 <NA> 2017-01-16 19:51:33
# 3 21605000182 21605000183 <NA> 2017-01-16 19:51:33
# 4 21605000182 21605000183 <NA> 2017-01-16 19:51:33
# 5 21510000669 21602000125 21608000366 2017-01-20 13:28:36
# 6 21609000856 <NA> <NA> 2017-01-20 13:28:36
# 7 21606000405 21608000354 21608000356 2017-01-20 13:28:36
# 8 21610000133 <NA> <NA> 2017-01-20 13:28:36
# 9 21604000592 21605000604 21605000608 2017-01-20 13:28:36
# 10 21609001012 <NA> <NA> 2017-01-20 13:28:36
# use the gather function from the tidyr package to transform the data from the
# wide format to a long format.
tidyr::gather(invoices, key = key, value = Invoice, -mtime, na.rm = TRUE) %>% print(n = Inf)
# # A tibble: 20 x 3
# mtime key Invoice
# * <dttm> <chr> <chr>
# 1 2017-01-16 19:51:33 Invoice.1 21605000182
# 2 2017-01-16 19:51:33 Invoice.1 21605000182
# 3 2017-01-16 19:51:33 Invoice.1 21605000182
# 4 2017-01-16 19:51:33 Invoice.1 21605000182
# 5 2017-01-20 13:28:36 Invoice.1 21510000669
# 6 2017-01-20 13:28:36 Invoice.1 21609000856
# 7 2017-01-20 13:28:36 Invoice.1 21606000405
# 8 2017-01-20 13:28:36 Invoice.1 21610000133
# 9 2017-01-20 13:28:36 Invoice.1 21604000592
# 10 2017-01-20 13:28:36 Invoice.1 21609001012
# 11 2017-01-16 19:51:33 Invoice.2 21605000183
# 12 2017-01-16 19:51:33 Invoice.2 21605000183
# 13 2017-01-16 19:51:33 Invoice.2 21605000183
# 14 2017-01-16 19:51:33 Invoice.2 21605000183
# 15 2017-01-20 13:28:36 Invoice.2 21602000125
# 16 2017-01-20 13:28:36 Invoice.2 21608000354
# 17 2017-01-20 13:28:36 Invoice.2 21605000604
# 18 2017-01-20 13:28:36 Invoice.3 21608000366
# 19 2017-01-20 13:28:36 Invoice.3 21608000356
# 20 2017-01-20 13:28:36 Invoice.3 21605000608
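The result above still carries the `key` column and the repeated invoice/date pairs, whereas the question asks for a single Invoice column. A possible follow-up is sketched below. It uses `pivot_longer`, the newer replacement for `gather`, so it assumes tidyr 1.0.0 or later, and it relies on dplyr for `select` and `distinct`; neither is part of the original answer. The sample rows come from the question.

```r
library(tidyr)
library(dplyr)

# Four sample rows taken from the question, kept as character strings
invoices <- data.frame(
  Invoice.1 = c("21605000182", "21605000182", "21510000669", "21609000856"),
  Invoice.2 = c("21605000183", "21605000183", "21602000125", NA),
  Invoice.3 = c(NA, NA, "21608000366", NA),
  mtime     = c("2017-01-16 19:51:33", "2017-01-16 19:51:33",
                "2017-01-20 13:28:36", "2017-01-20 13:28:36"),
  stringsAsFactors = FALSE
)

long <- invoices %>%
  # Wide -> long over every Invoice.* column, dropping NA invoices
  pivot_longer(cols = starts_with("Invoice."), names_to = "key",
               values_to = "Invoice", values_drop_na = TRUE) %>%
  # Keep only the two columns the question asks for
  select(Invoice, mtime) %>%
  # Remove repeated (Invoice, mtime) pairs
  distinct()

long
```

Note that `pivot_longer` emits the values row by row rather than column by column, so the row order differs from the `gather` output; add `arrange()` if a particular order matters.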