Question

I am working with the following data in RStudio and trying to assign column names to it. I used the following command:

nlsdata <- read.table("C:/Users/perdue/Desktop/Adv.MicroEconometrics/HA 3/data/nls.dat", header = FALSE, dec = ".")

^ This command returns 'v1 v2 v3 ... v52' as the column names in the first row. When I follow up with

colnames(nlsdata)

I get a list of names: v1 v2 ... v52

col.names(nlsdata) <-c("inputid","nearc2","nearc4","nearc4a","nearc4b","ed76","ed66","age76","daded","nodaded","momed","nomomed","weight","momdad14","sinmom14","step14","reg661","reg662","reg663","reg664","reg665","reg666","reg667","reg668","reg669","south66","work76","work78","lwage76","lwage78","famed","black","smsa76r","smsa78r","reg76r","reg78r","reg80r","smsa66r","wage76","wage78","wage80","noint78","enroll76","enroll78","enroll80","kww","iq","marsta76","marsta78","marsta80","libcrd14") where newname[i] is the ith column name of dataframe nlsdata

Error: unexpected symbol in ""south66","work76","work78","lwage76","lwage78","famed","black","smsa76r","smsa78r","reg76r","reg78r","reg80r","smsa66r","wage76","wage78","wage80", "noint78","enroll76","enroll78","enroll80","

The error message seems to indicate a syntax error. I have gone over the code more than once and cannot spot one.

Answer 1

The correct code is

nlsdata<-read.table("C:/Users/name/Desktop/nls.dat", header = FALSE, skip = 1, dec = ".")

Then use colnames to add the column names

colnames(nlsdata)<-c("inputid","nearc2","nearc4","nearc4a","nearc4b","ed76","ed66","age76","daded","nodaded","momed","nomomed","weight","momdad14","sinmom14","step14","reg661","reg662","reg663","reg664","reg665","reg666","reg667","reg668","reg669","south66","work76","work78","lwage76","lwage78","famed","black","smsa76r","smsa78r","reg76r","reg78r","reg80r","smsa66r","wage76","wage78","wage80","noint78","enroll76","enroll78","enroll80","kww","iq","marsta76","marsta78","marsta80","libcrd14")

And check the result

head(nlsdata)
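
As a minimal sketch of the same pattern on toy data: the inline text and the shortened name vector below are hypothetical stand-ins for nls.dat and its 51 names. The length check is an added safeguard, not part of the answer above; it makes a mismatch between the name vector and the column count fail early with a clear message.

```r
# Toy stand-in for nls.dat: three whitespace-separated columns, no header row.
raw_text <- "2 0 9.94
3 0 8.00
4 1 14.00"

# Hypothetical shortened name vector; the real file needs the full list.
new_names <- c("inputid", "nearc2", "daded")

toy <- read.table(text = raw_text, header = FALSE, dec = ".")

# Stop early if the number of names does not match the number of columns.
stopifnot(length(new_names) == ncol(toy))

# Note the function is colnames(), not col.names().
colnames(toy) <- new_names
head(toy)
```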

Answer 2

The error you are getting seems to be related to these two quotation marks:

""south66"

readr::read_table() reads the file just fine:

library(readr)
url <- "https://raw.githubusercontent.com/108michael/ms_thesis/ca258bc684c3a6f8ade13769590439ad1e8387d7/nls.dat"
col_names = c("inputid","nearc2","nearc4","nearc4a","nearc4b","ed76","ed66",
  "age76","daded","nodaded","momed","nomomed","weight","momdad14","sinmom14",
  "step14","reg661", "reg662","reg663","reg664","reg665","reg666","reg667",
  "reg668","reg669","south66","work76","work78","lwage76","lwage78","famed",
  "black","smsa76r","smsa78r","reg76r","reg78r","reg80r","smsa66r","wage76",
  "wage78","wage80","noint78","enroll76","enroll78","enroll80",
  "kww","iq","marsta76","marsta78","marsta80","libcrd14")
read_table( url, col_names = col_names, na = "." )
#> Parsed with column specification:
#> cols(
#>   .default = col_integer(),
#>   daded = col_double(),
#>   momed = col_double(),
#>   lwage76 = col_double(),
#>   lwage78 = col_double()
#> )
#> See spec(...) for full column specifications.
#> # A tibble: 3,613 x 51
#>    inputid nearc2 nearc4 nearc4a nearc4b  ed76  ed66 age76 daded nodaded
#>      <int>  <int>  <int>   <int>   <int> <int> <int> <int> <dbl>   <int>
#>  1       2      0      0       0       0     7     5    29  9.94       1
#>  2       3      0      0       0       0    12    11    27  8.00       0
#>  3       4      0      0       0       0    12    12    34 14.00       0
#>  4       5      1      1       1       0    11    11    27 11.00       0
#>  5       6      1      1       1       0    12    12    34  8.00       0
#>  6       7      1      1       1       0    12    11    26  9.00       0
#>  7       8      1      1       1       0    18    16    33 14.00       0
#>  8       9      1      1       1       0    14    13    29 14.00       0
#>  9      10      1      1       1       0    12    12    28 12.00       0
#> 10      11      1      1       1       0    12    12    29 12.00       0
#> # ... with 3,603 more rows, and 41 more variables: momed <dbl>,
#> #   nomomed <int>, weight <int>, momdad14 <int>, sinmom14 <int>,
#> #   step14 <int>, reg661 <int>, reg662 <int>, reg663 <int>, reg664 <int>,
#> #   reg665 <int>, reg666 <int>, reg667 <int>, reg668 <int>, reg669 <int>,
#> #   south66 <int>, work76 <int>, work78 <int>, lwage76 <dbl>,
#> #   lwage78 <dbl>, famed <int>, black <int>, smsa76r <int>, smsa78r <int>,
#> #   reg76r <int>, reg78r <int>, reg80r <int>, smsa66r <int>, wage76 <int>,
#> #   wage78 <int>, wage80 <int>, noint78 <int>, enroll76 <int>,
#> #   enroll78 <int>, enroll80 <int>, kww <int>, iq <int>, marsta76 <int>,
#> #   marsta78 <int>, marsta80 <int>, libcrd14 <int>
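
For comparison, a base R sketch that does roughly the same thing without readr: read.table() accepts the names through its col.names argument and treats "." as missing through na.strings. The inline toy data and the three names are hypothetical stand-ins for the real file, so this is an illustration under those assumptions rather than a tested replacement for the call above.

```r
# Toy stand-in for nls.dat, with "." marking a missing value.
raw_text <- "2 0 9.94
3 0 .
4 1 14.00"

# Names are supplied at read time, so no separate colnames() step is needed.
toy <- read.table(text = raw_text, header = FALSE,
                  col.names = c("inputid", "nearc2", "daded"),
                  na.strings = ".")

# Inspect the column types and confirm the "." was read as NA.
str(toy)
```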

How to assign column names to a table