fwrite: append by columns

Time: 2018-10-24 19:41:14

Tags: r data.table

I want to append matrices by columns to an existing file, without first binding the matrices in R. Here is an example:

library(data.table)

A <- data.table(matrix(1:9, 3))
names(A) <- as.character(1:3)

B <- data.table(matrix(10:18, 3))
names(B) <- as.character(4:6)

fwrite(A, file = "test.txt", sep = " ", row.names = FALSE, col.names = FALSE)

fwrite(B, file = "test.txt", sep = " ", row.names = FALSE, col.names = FALSE, append = TRUE)

I tried changing the column names, without success. This is the result:

> fread("test.txt")
   V1 V2 V3
1:  1  4  7
2:  2  5  8
3:  3  6  9
4: 10 13 16
5: 11 14 17
6: 12 15 18

This is what I would like to get:

1 4 7 10 13 16
2 5 8 11 14 17
3 6 9 12 15 18

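I know that in my example I could just run AB <- cbind(A, B) and then run fwrite(AB). In practice, though, A and B are very large matrices, and I do not have enough memory to allocate the combined matrix.

Note that this may not be doable with fwrite(), so I am open to other methods.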


Edit
I found a temporary solution by transposing the matrices:

A <- data.table(t(matrix(1:9, 3)))
B <- data.table(t(matrix(10:18, 3)))

fwrite(A, file = "test.txt", sep = " ", row.names = FALSE, col.names = FALSE)
fwrite(B, file = "test.txt", sep = " ", row.names = FALSE, col.names = FALSE, append = TRUE)

> t(fread("test.txt"))
   [,1] [,2] [,3] [,4] [,5] [,6]
V1    1    4    7   10   13   16
V2    2    5    8   11   14   17
V3    3    6    9   12   15   18

This solution is not ideal, so I am still hoping someone can suggest a better one.
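
One possible direction, sketched here under assumptions rather than taken from the thread: if each block is first written to its own file, the files can be merged side by side by reading a limited number of lines at a time, so the combined matrix never has to exist in memory. The helper name `paste_files_by_column` and its `chunk_lines` parameter are hypothetical. The demo uses base `write.table()` only to stay self-contained; files written by `fwrite()` with the same separator should behave the same way.

```r
# Hypothetical helper: join two delimited text files side by side,
# holding at most `chunk_lines` lines of each file in memory at once.
paste_files_by_column <- function(left, right, out, sep = " ", chunk_lines = 10000L) {
  con_left  <- file(left,  open = "r")
  con_right <- file(right, open = "r")
  con_out   <- file(out,   open = "w")
  on.exit({
    close(con_left)
    close(con_right)
    close(con_out)
  })
  repeat {
    # Open connections keep their position, so each call reads the next chunk.
    x <- readLines(con_left,  n = chunk_lines)
    y <- readLines(con_right, n = chunk_lines)
    if (length(x) != length(y)) stop("the two files have different numbers of rows")
    if (length(x) == 0L) break
    writeLines(paste(x, y, sep = sep), con_out)
  }
  invisible(out)
}

# Demo with the matrices from the question, written to temporary files.
a_file   <- tempfile(fileext = ".txt")
b_file   <- tempfile(fileext = ".txt")
out_file <- tempfile(fileext = ".txt")
write.table(matrix(1:9, 3),   a_file, sep = " ", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
write.table(matrix(10:18, 3), b_file, sep = " ", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

paste_files_by_column(a_file, b_file, out_file, chunk_lines = 2L)
readLines(out_file)
# [1] "1 4 7 10 13 16" "2 5 8 11 14 17" "3 6 9 12 15 18"
```

This relies on both files having the same number of rows in the same order. It costs extra disk space for the intermediate files, but memory use is bounded by the chunk size.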

1 answer:

Answer 0: (score: 0)

You can try the SQL way.

Data

library(data.table)

A <- data.table(matrix(1:9, 3))
names(A) <- paste0("col", as.character(1:3))
A$id <- row.names(A)

B <- data.table(matrix(10:18, 3))
names(B) <- paste0("col", as.character(4:6))
B$id <- row.names(B)

fwrite(A, file = "A.txt", sep = " ", row.names = FALSE, col.names = TRUE)
fwrite(B, file = "B.txt", sep = " ", row.names = FALSE, col.names = TRUE)

Using RSQLite

library(RSQLite)
db <- dbConnect(SQLite(), dbname = "Test.sqlite")
dbWriteTable(conn = db, name = "tab1", value = "A.txt", row.names = FALSE, header = TRUE, sep = " ")
dbWriteTable(conn = db, name = "tab2", value = "B.txt", row.names = FALSE, header = TRUE, sep = " ")

# Retrieve the column names of each table (excluding id)
col1 <- dbListFields(db, "tab1")         
col2 <- dbListFields(db, "tab2")         
col1 <- col1[!col1 %in% "id"]
col2 <- col2[!col2 %in% "id"]

# Append the columns by joining on the id (row names) created above
sql_q <- paste0('CREATE TABLE tab3 AS SELECT ', 
                 paste('t1.', col1, collapse = ", ", sep = ""), ', ', 
                 paste('t2.', col2, collapse = ", ", sep = ""), ' FROM tab1 t1 INNER JOIN tab2 t2 ON t1.id = t2.id')

> sql_q
[1] "CREATE TABLE tab3 AS SELECT t1.col1, t1.col2, t1.col3, t2.col4, t2.col5, t2.col6 FROM tab1 t1 INNER JOIN tab2 t2 ON t1.id = t2.id"

# dbExecute() suits statements that return no rows and leaves no open result set
dbExecute(conn = db, statement = sql_q)
dbGetQuery(db, 'SELECT * FROM tab3')

> dbGetQuery(db, 'SELECT * FROM tab3')
  col1 col2 col3 col4 col5 col6
1    1    4    7   10   13   16
2    2    5    8   11   14   17
3    3    6    9   12   15   18
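
Since the goal is a file rather than an in-memory table, the joined table can also be exported in pieces. The following is a minimal sketch, not part of the original answer: it assumes a small in-memory stand-in for tab3 and an arbitrary chunk size of 2 rows, and it uses the DBI functions `dbSendQuery()`, `dbFetch()`, `dbHasCompleted()` and `dbClearResult()` to fetch a few rows at a time and append them to the output with `fwrite()`.

```r
library(DBI)
library(RSQLite)
library(data.table)

# Stand-in for the joined table (assumption: the real tab3 lives in Test.sqlite).
db <- dbConnect(SQLite(), ":memory:")
dbWriteTable(db, "tab3", data.frame(col1 = 1:3, col4 = 10:12))

out_file <- tempfile(fileext = ".txt")
res <- dbSendQuery(db, "SELECT * FROM tab3")
first_chunk <- TRUE
while (!dbHasCompleted(res)) {
  chunk <- dbFetch(res, n = 2)  # fetch at most 2 rows per iteration
  if (nrow(chunk) > 0L) {
    # The first chunk creates the file, later chunks are appended as new rows.
    fwrite(chunk, out_file, sep = " ", col.names = FALSE, append = !first_chunk)
    first_chunk <- FALSE
  }
}
dbClearResult(res)
dbDisconnect(db)

readLines(out_file)
```

SQL does not guarantee row order without an ORDER BY clause, so keeping a sortable key column in tab3 may be worth considering when the row order matters.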