Question

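The file I'm working with is essentially a data dump of multiple tables. I'm trying to do two things with the data. First, split the file into multiple files, one per table. I have this working. Second, I want to create a data frame with the following structure: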

| table_name       | count | headers                   |
|------------------|-------|---------------------------|
| sample_table     | 5     | header_1,header2,header 3 |
| sample_table_two | 2     | header_1,header_2         |

The "headers" column should be an array/vector of strings. Whenever I try to run my code, I get the following message:

There were 50 or more warnings (use warnings() to see the first 50)

A sample of the warning output:

 Warning messages:
1: In rbind(names(probs), probs_f) :
  number of columns of result is not a multiple of vector length (arg 1)
2: In x[[jj]][iseq] <- vjj :
  number of items to replace is not a multiple of replacement length

The structure of colNames looks like what I expect:

 $ X4: chr "vendor_uid"
 $ X5: chr "error_code"
 $ X6: chr "date"
 $ X7: chr "status"
 $ X8: chr "is_rebuild"

However, in the value of newRow, the last entry is not what I expected (I thought it would be a vector).

 $ : chr "amphire_status"
 $ : int 1
 $ :Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   1 obs. of  5 variables:
  ..$ X4: chr "vendor_uid"
  ..$ X5: chr "error_code"
  ..$ X6: chr "date"
  ..$ X7: chr "status"
  ..$ X8: chr "is_rebuild"
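
For comparison, here is a minimal sketch of the difference between a one-row data frame and a plain character vector. The `colNames` object below is a hypothetical stand-in for `tmp[1, ]`, not the real data, and `unlist()` is only one possible way to flatten it.

```r
# Hypothetical stand-in for colNames <- tmp[1, ] (not the real data)
colNames <- data.frame(X4 = "vendor_uid", X5 = "error_code", X6 = "date",
                       stringsAsFactors = FALSE)

# A one-row data frame is still a list of columns, not a character vector
print(is.data.frame(colNames))

# unlist() flattens the columns into a single character vector
headerVec <- unlist(colNames, use.names = FALSE)
str(headerVec)
```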

In tableDF, the other two columns look reasonable, but the "headers" column contains only the first column name/header.

'data.frame':   2 obs. of  3 variables:
 $ tableName: chr  "access_log" "amphire_status"
 $ count    : int  81 1
 $ headers  :List of 2
  ..$ : chr "access_uid"
  ..$ : chr "vendor_uid"

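What am I doing wrong? If this is an obvious mistake caused by my unfamiliarity with R, I apologize. I have referred to this question, but I still could not solve my problem.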

Edit: Can't believe I forgot to add the code:

library(tidyverse)
library(jsonlite)


pollFile <- read_csv("sample.csv", col_names = FALSE)
has_data <- function(x) { sum(!is.na(x)) > 0 }
tableDF <- data.frame("tableName" = character(), "count" = integer(), "headers" = vector(), stringsAsFactors = FALSE)

for(name in levels(factor(pollFile$X1) )){

  tmp <- pollFile %>% filter(X1 == name) %>% select(-(X1:X3)) 
  tmp <- tmp %>% select_if(has_data)
  colNames <- tmp[1,]
  newRow <- list(name, nrow(tmp), as.vector(colNames))
  tableDF[nrow(tableDF) + 1, ] <- newRow

  if(count(tmp) > 1) {
    fn <- paste("poll_table/", name,".csv")
    write_csv(tmp,fn,col_names = FALSE)
  }

}
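
One possible way around the row-by-row assignment, sketched in base R under stated assumptions: collect the names, counts, and header vectors in separate containers inside the loop, then build the data frame once and attach the headers as a list column. The small `pollFile` below is a hypothetical stand-in for `sample.csv` (with X1 as the table name and the first row of each table holding its headers), so the column positions differ from the real file.

```r
# Hypothetical stand-in for pollFile; the real data comes from sample.csv
pollFile <- data.frame(
  X1 = c("access_log", "access_log", "amphire_status"),
  X2 = c("access_uid", "17", "vendor_uid"),
  X3 = c("user_name", "alice", NA),
  stringsAsFactors = FALSE
)

tableNames <- character()
counts <- integer()
headers <- list()

for (name in unique(pollFile$X1)) {
  # Rows for this table, without the table-name column
  tmp <- pollFile[pollFile$X1 == name, -1, drop = FALSE]
  # Keep only columns that contain at least one non-missing value
  tmp <- tmp[, colSums(!is.na(tmp)) > 0, drop = FALSE]

  tableNames <- c(tableNames, name)
  counts <- c(counts, nrow(tmp))
  # Flatten the first row into a character vector and store it as one list element
  headers[[length(headers) + 1]] <- unlist(tmp[1, , drop = FALSE], use.names = FALSE)
}

# Build the atomic columns first, then attach the list column
tableDF <- data.frame(tableName = tableNames, count = counts,
                      stringsAsFactors = FALSE)
tableDF$headers <- headers
str(tableDF)
```

Because each element of `headers` is a full character vector, `tableDF$headers[[2]]` returns every header for the second table rather than only the first one.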

R - How to add rows/observations to a data frame with a vector column type

0 Answers: