Question

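I am trying to split a large dataset and then:

use a loop to change the column names of each piece, and
save all the individual pieces again in a single stacked file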

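I am using some sample data like the following:

First, I split the dataset in 2 according to the source name in the first column and read the pieces into a list with the following code: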

out <- split( sample , f = sample$Source)

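Now I am struggling to set up a loop that changes the colnames of columns 2 to 8 by matching the existing colnames against the following 'info' table and replacing them according to the source name in the first column of the 'info' table.

The info table looks like this:

So the loop should change the colnames to something like this:

I was just wondering whether anyone who has done something similar could tell me how?

When I try to join the pieces back together, I can only set the colnames by using the merge function. Is there any way to stack them so that I can keep the colnames of each table, looking like this?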

My sample input files are:

> dput(sample)
structure(list(Source = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 
2L, 2L), .Label = c("Stack 1", "Stack 2"), class = "factor"), 
    year = c(2010L, 2010L, 2010L, 2010L, 2010L, 2010L, 2010L, 
    2010L, 2010L), day = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
    ), hour = c(0L, 1L, 2L, 3L, 0L, 1L, 2L, 3L, 4L), `EXIT VEL` = c(26.2, 
    26.2, 26.2, 26.2, 22.4, 22.4, 22.4, 22.4, 22.4), TEMP = c(341L, 
    341L, 341L, 341L, 328L, 328L, 328L, 328L, 328L), `STACK DIAM` = c(1.5, 
    1.5, 1.5, 1.5, 2.5, 2.5, 2.5, 2.5, 2.5), W = c(0L, 0L, 0L, 
    0L, 15L, 15L, 15L, 15L, 15L), Nox = c(39, 39, 39, 39, 33.3, 
    33.3, 33.3, 33.3, 33.3), Sox = c(15.5, 15.5, 15.5, 15.5, 
    17.9, 17.9, 17.9, 17.9, 17.9)), .Names = c("Source", "year", 
"day", "hour", "EXIT VEL", "TEMP", "STACK DIAM", "W", "Nox", 
"Sox"), class = "data.frame", row.names = c(NA, -9L))

> dput(stack_info)
structure(list(SNAME = structure(1:2, .Label = c("Stack 1", "Stack 2"
), class = "factor"), ISVARY = c(1L, 4L), VELVOL = c(1L, 4L), 
    TEMPDENS = c(0L, 2L), `DUM 1` = c(999L, 999L), `DUM 2` = c(999L, 
    999L), NPOL = c(2L, 2L), `EXIT VEL` = c(26.2, 22.4), TEMP = c(341L, 
    328L), `STACK DIAM` = c(1.5, 2.5), W = c(0L, 15L), Nox = c(39, 
    33.3), Sox = c(15.5, 17.9)), .Names = c("SNAME", "ISVARY", 
"VELVOL", "TEMPDENS", "DUM 1", "DUM 2", "NPOL", "EXIT VEL", "TEMP", 
"STACK DIAM", "W", "Nox", "Sox"), class = "data.frame", row.names = c(NA, 
-2L))

Thanks in advance

Answer 1

The best I have come up with is:

out <- split( sample , f = sample$Source) # your original step

stack_info[, 1] <- as.character(stack_info[, 1]) # keep SNAME as strings so unlist() below returns the names, not factor codes
out <- lapply(names(out), function(x) {
  # Get the future names: the first 7 columns of this source's row in stack_info
  new_cnames <- unname(unlist(stack_info[stack_info$SNAME == x, 1:7]))
  # Replace the names of columns 2 to 8, keeping "Source" and the last two columns
  colnames(out[[x]]) <- c("Source", new_cnames, colnames(out[[x]])[9:10])
  # Return the modified version without the first column
  out[[x]][, -1]
})

# Write each piece with its own header line. Change "" to the file name you wish and sep
# to your desired separator; see ?write.table for more documentation.
invisible(lapply(out, write.table, append = TRUE, file = "", row.names = FALSE, sep = "|"))

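The main idea is to loop over the data frames to change their colnames. I update the list and then loop a second time to write it out; you may prefer to append to the file inside the first loop instead.

I hope the comments are enough to follow the code. Tell me if it needs more detail.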
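As a rough sketch of the "append in the first loop" variant, the following writes each piece straight to a file in a single pass. The names `dat`, `info` and `out_file` are hypothetical stand-ins (abridged copies of the question's `sample` and `stack_info`, and a temporary file), and the header is written as a plain text line rather than set as column names, which is an assumption about what is wanted, not part of the code above.

```r
# Abridged, hypothetical stand-ins for the question's `sample` and `stack_info`.
dat <- data.frame(
  Source = c("Stack 1", "Stack 1", "Stack 2"),
  year = 2010L, day = 1L, hour = c(0L, 1L, 0L),
  `EXIT VEL` = c(26.2, 26.2, 22.4), TEMP = c(341L, 341L, 328L),
  `STACK DIAM` = c(1.5, 1.5, 2.5), W = c(0L, 0L, 15L),
  Nox = c(39, 39, 33.3), Sox = c(15.5, 15.5, 17.9),
  check.names = FALSE, stringsAsFactors = FALSE
)
info <- data.frame(
  SNAME = c("Stack 1", "Stack 2"), ISVARY = c(1L, 4L), VELVOL = c(1L, 4L),
  TEMPDENS = c(0L, 2L), `DUM 1` = 999L, `DUM 2` = 999L, NPOL = 2L,
  check.names = FALSE, stringsAsFactors = FALSE
)

out_file <- tempfile(fileext = ".txt")  # hypothetical output path

for (src in unique(dat$Source)) {
  # Rows of this source, with the Source column dropped
  piece <- dat[dat$Source == src, -1]
  # Header: the 7 info values for this source, then the last two column names
  header <- c(unlist(info[info$SNAME == src, 1:7], use.names = FALSE),
              names(piece)[8:9])
  # Write the header as plain text, then append the data rows without column names
  cat(paste(header, collapse = "|"), "\n", sep = "", file = out_file, append = TRUE)
  write.table(piece, file = out_file, append = TRUE, sep = "|",
              row.names = FALSE, col.names = FALSE, quote = FALSE)
}

lines <- readLines(out_file)
```

Because the header never becomes a set of column names, repeated values such as `999` stay as they are in the info table, and `write.table` has no column names to append, so it should not raise the "appending column names" warning.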

Screen output (warnings omitted):

"Stack 1"|"1"|"1.1"|"0"|"999"|"999.1"|"2"|"Nox"|"Sox"
2010|1|0|26.2|341|1.5|0|39|15.5
2010|1|1|26.2|341|1.5|0|39|15.5
2010|1|2|26.2|341|1.5|0|39|15.5
2010|1|3|26.2|341|1.5|0|39|15.5
"Stack 2"|"4"|"4.1"|"2"|"999"|"999.1"|"2.1"|"Nox"|"Sox"
2010|1|0|22.4|328|2.5|15|33.3|17.9
2010|1|1|22.4|328|2.5|15|33.3|17.9
2010|1|2|22.4|328|2.5|15|33.3|17.9
2010|1|3|22.4|328|2.5|15|33.3|17.9
2010|1|4|22.4|328|2.5|15|33.3|17.9
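
A note on the `.1` suffixes in the header lines: they are not in the info table. As far as I can tell they come from dropping the first column with `[, -1]`, because subsetting a data frame by column makes duplicated column names unique, in the same way `make.unique()` does. A small illustration, using a hypothetical three-column data frame `df`:

```r
# Header values for "Stack 1" as they come out of the info table
hdr <- c("Stack 1", "1", "1", "0", "999", "999", "2")
uniq <- make.unique(hdr)
print(uniq)

# Hypothetical data frame with duplicated names assigned after creation
df <- data.frame(a = 1, b = 2, c = 3)
names(df) <- c("Source", "1", "1")   # duplicates are allowed at this point
sub_names <- names(df[, -1])         # column subsetting de-duplicates the names
print(sub_names)
```

If the header has to repeat the info values exactly, one option is to write the header line as text and the data without column names.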
