用多行标题写数据帧

时间:2020-10-18 19:46:44

标签: r

我正在尝试使用已存储在单独对象中的某些标头信息编写数据帧。有人知道该怎么做吗?

每次尝试时,数据帧都写入一列或不包含标题信息。我正在做一些数据清理,需要保存更新的数据框并附加相同的标头信息。

感谢您的帮助!

cars.tsv
=====
Dataset: mtcars

Date: 18 OCT 2020

mpg cyl disp    hp  drat    wt  qsec    vs  am  gear    carb
21  6   160 110 3.9 2.62    16.46   0   1   4   4
21  6   160 110 3.9 2.875   17.02   0   1   4   4
=====
head <- readr::read_lines("cars.tsv", n_max = 4)
df <- readr::read_tsv("cars.tsv", skip = 4)
readr::write_lines(c(head, df), "out_c.tsv")
readr::write_lines(list(head, df), "out_list.tsv")

1 个答案:

答案 0 :(得分:1)

以下功能将为您提供所需的数据框和文件路径:

write_special <- function(df, filepath) {
  string <- c(paste("Dataset:", deparse(substitute(df)), "\n"),
              paste("Date:", date(), "\n"))
              
  writeLines(string, filepath)
  suppressWarnings(write.table(df, filepath, append = TRUE, sep = "\t"))
}

例如,如果我这样做:

write_special(mtcars, "mtcars.txt")

然后,“ mtcars.txt”如下所示:

Dataset: mtcars 

Date: Sun Oct 18 20:19:13 2020 

"mpg"   "cyl"   "disp"  "hp"    "drat"  "wt"    "qsec"  "vs"    "am"    "gear"  "carb"
"Mazda RX4" 21  6   160 110 3.9 2.62    16.46   0   1   4   4
"Mazda RX4 Wag" 21  6   160 110 3.9 2.875   17.02   0   1   4   4
"Datsun 710"    22.8    4   108 93  3.85    2.32    18.61   1   1   4   1
"Hornet 4 Drive"    21.4    6   258 110 3.08    3.215   19.44   1   0   3   1
"Hornet Sportabout" 18.7    8   360 175 3.15    3.44    17.02   0   0   3   2
"Valiant"   18.1    6   225 105 2.76    3.46    20.22   1   0   3   1
"Duster 360"    14.3    8   360 245 3.21    3.57    15.84   0   0   3   4
"Merc 240D" 24.4    4   146.7   62  3.69    3.19    20  1   0   4   2
"Merc 230"  22.8    4   140.8   95  3.92    3.15    22.9    1   0   4   2
"Merc 280"  19.2    6   167.6   123 3.92    3.44    18.3    1   0   4   4
"Merc 280C" 17.8    6   167.6   123 3.92    3.44    18.9    1   0   4   4
"Merc 450SE"    16.4    8   275.8   180 3.07    4.07    17.4    0   0   3   3
"Merc 450SL"    17.3    8   275.8   180 3.07    3.73    17.6    0   0   3   3
"Merc 450SLC"   15.2    8   275.8   180 3.07    3.78    18  0   0   3   3
"Cadillac Fleetwood"    10.4    8   472 205 2.93    5.25    17.98   0   0   3   4
"Lincoln Continental"   10.4    8   460 215 3   5.424   17.82   0   0   3   4
"Chrysler Imperial" 14.7    8   440 230 3.23    5.345   17.42   0   0   3   4
"Fiat 128"  32.4    4   78.7    66  4.08    2.2 19.47   1   1   4   1
"Honda Civic"   30.4    4   75.7    52  4.93    1.615   18.52   1   1   4   2
"Toyota Corolla"    33.9    4   71.1    65  4.22    1.835   19.9    1   1   4   1
"Toyota Corona" 21.5    4   120.1   97  3.7 2.465   20.01   1   0   3   1
"Dodge Challenger"  15.5    8   318 150 2.76    3.52    16.87   0   0   3   2
"AMC Javelin"   15.2    8   304 150 3.15    3.435   17.3    0   0   3   2
"Camaro Z28"    13.3    8   350 245 3.73    3.84    15.41   0   0   3   4
"Pontiac Firebird"  19.2    8   400 175 3.08    3.845   17.05   0   0   3   2
"Fiat X1-9" 27.3    4   79  66  4.08    1.935   18.9    1   1   4   1
"Porsche 914-2" 26  4   120.3   91  4.43    2.14    16.7    0   1   5   2
"Lotus Europa"  30.4    4   95.1    113 3.77    1.513   16.9    1   1   5   2
"Ford Pantera L"    15.8    8   351 264 4.22    3.17    14.5    0   1   5   4
"Ferrari Dino"  19.7    6   145 175 3.62    2.77    15.5    0   1   5   6
"Maserati Bora" 15  8   301 335 3.54    3.57    14.6    0   1   5   8
"Volvo 142E"    21.4    4   121 109 4.11    2.78    18.6    1   1   4   2

请注意,我没有将文件命名为“ mtcars.tsv”,原因很简单,由于顶部有多余的行,因此现在不是 tsv文件。如果尝试将其读取为一个,则不会取回数据框。您将需要另一个函数来读取您定制的文件类型。像这样:

read_special <- function(filepath) {
  df <- read.table(filepath, skip = 4, header = TRUE)
  fileinfo <- readLines(filepath, n = 4)[c(1, 3)]
  attr(df, "Dataset") <- trimws(strsplit(fileinfo[1], ":")[[1]][2])
  attr(df, "Date") <- trimws(strsplit(fileinfo[2], ":")[[1]][2])
  df
}

这将返回带有附加信息作为属性的附加信息的数据框