Question

I have some data in this format that I want to import into R. So far I have read it with read.csv, but every piece ends up on its own row.

So far so good, but I need it in the following format:

18.07.19 05:41:05 Information
18.07.19 05:43:48 Something
18.07.19 05:20:48 Text
18.07.19 01:16:45

because I want to use the data as a data frame.

I think dcast might be the right approach, but I can't figure out what I need to pass as arguments.

Answer 1

Here is a hack that uses data.table::dcast, since you mentioned it:

x <- read.csv(header=FALSE, stringsAsFactors=FALSE, text="
18.07.19
05:41:05
Information
18.07.19
05:43:48
Something
18.07.19
05:20:48
Text
18.07.19
01:16:45")

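# i: position within each group of three (becomes the column);
# j: group number (becomes the row).
x$i <- head(rep(1:3, times=ceiling(nrow(x) / 3)), n = nrow(x))
x$j <- head(rep(1:ceiling(nrow(x) / 3), each=3), n = nrow(x))

# Recent data.table versions may reject a plain data.frame here;
# if so, pass data.table::as.data.table(x) instead.
data.table::dcast(x, j ~ i, value.var="V1")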
#   j        1        2           3
# 1 1 18.07.19 05:41:05 Information
# 2 2 18.07.19 05:43:48   Something
# 3 3 18.07.19 05:20:48        Text
# 4 4 18.07.19 01:16:45        <NA>

(You can easily drop j and rename the columns.)
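
One way to do that cleanup in base R, as a minimal sketch. The `wide` object here is a hand-built stand-in for the dcast result shown above (so the snippet runs without data.table), and the column names date, time, and message are only illustrative assumptions:

```r
# Stand-in for the dcast() result shown above, rebuilt by hand.
wide <- data.frame(
  j = 1:4,
  `1` = c("18.07.19", "18.07.19", "18.07.19", "18.07.19"),
  `2` = c("05:41:05", "05:43:48", "05:20:48", "01:16:45"),
  `3` = c("Information", "Something", "Text", NA),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Drop the helper index j, then give the remaining columns readable names.
# The names below are assumptions made for this example.
result <- wide[, setdiff(names(wide), "j")]
names(result) <- c("date", "time", "message")
print(result)
```

If the real result is a data.table rather than a data.frame, converting it with as.data.frame() first keeps the column selection above behaving the same way.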

Answer 2

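Another hack, this time in base R, since this is just a single-column data frame. We pad the column with NA so that its length is a multiple of the number of columns, then fill a matrix row by row with that number of columns.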

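n <- 3
# Pad only as far as the next multiple of n (no padding if already a multiple).
pad <- (n - length(df$V1) %% n) %% n
data.frame(matrix(c(df$V1, rep(NA, pad)), ncol = n, byrow = TRUE))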

#        X1       X2          X3
#1 18.07.19 05:41:05 Information
#2 18.07.19 05:43:48   Something
#3 18.07.19 05:20:48        Text
#4 18.07.19 01:16:45        <NA>
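
The padding step above could also be wrapped in a small helper, sketched here under a few assumptions: the function name rows_of_n and the column names are hypothetical, and the input is assumed to be a plain character vector whose values come in groups of n.

```r
# Hypothetical helper: reshape a vector into rows of n values, padding with NA.
rows_of_n <- function(values, n, col_names = NULL) {
  # NAs needed to reach the next multiple of n (0 if already a multiple).
  pad <- (n - length(values) %% n) %% n
  out <- data.frame(
    matrix(c(values, rep(NA, pad)), ncol = n, byrow = TRUE),
    stringsAsFactors = FALSE
  )
  if (!is.null(col_names)) names(out) <- col_names
  out
}

df <- data.frame(
  V1 = c("18.07.19", "05:41:05", "Information",
         "18.07.19", "05:43:48", "Something",
         "18.07.19", "05:20:48", "Text",
         "18.07.19", "01:16:45"),
  stringsAsFactors = FALSE
)

# Column names are illustrative assumptions.
res <- rows_of_n(df$V1, 3, c("date", "time", "message"))
print(res)

# A vector whose length is already a multiple of n gets no extra all-NA row.
complete <- rows_of_n(df$V1[1:9], 3)
```

Taking the padding modulo n matters only when the input is already complete: without it, a full set of n NAs would be appended and produce a trailing empty row.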

Data

df <- structure(list(V1 = c("18.07.19", "05:41:05", "Information", 
"18.07.19", "05:43:48", "Something", "18.07.19", "05:20:48", 
"Text", "18.07.19", "01:16:45")), row.names = c(NA, -11L), class = 
"data.frame")

Merging rows into a data frame in R
