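I am trying to import the CSV data from https://data.worldbank.org/indicator/IS.AIR.PSGR
using read.csv. However, read.csv returns:
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
  more columns than column names
I have searched earlier posts, but the answers there seem to depend on the particular data table involved,
so what is going wrong in this case?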
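Answer 0: (score: 0)
The problem is that the first 4 lines of the file contain extra text that precedes the real header row, so read.csv takes the wrong line as the column names. You need to use skip = 4. Using read_csv from the readr package is better, because it preserves the original column names.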
library(readr)
dat <- read_csv("API_IS.AIR.PSGR_DS2_en_csv_v2.csv", skip = 4)
#> Warning: Missing column names filled in: 'X63' [63]
#> Parsed with column specification:
#> cols(
#> .default = col_integer(),
#> `Country Name` = col_character(),
#> `Country Code` = col_character(),
#> `Indicator Name` = col_character(),
#> `Indicator Code` = col_character(),
#> `1960` = col_character(),
#> `1961` = col_character(),
#> `1962` = col_character(),
#> `1963` = col_character(),
#> `1964` = col_character(),
#> `1965` = col_character(),
#> `1966` = col_character(),
#> `1967` = col_character(),
#> `1968` = col_character(),
#> `1969` = col_character(),
#> `1995` = col_double(),
#> `2007` = col_double(),
#> `2008` = col_double(),
#> `2009` = col_double(),
#> `2010` = col_double(),
#> `2011` = col_double()
#> # ... with 7 more columns
#> )
#> See spec(...) for full column specifications.
head(dat)
#> # A tibble: 6 x 63
#> `Country Name` `Country Code` `Indicator Name` `Indicator Code` `1960`
#> <chr> <chr> <chr> <chr> <chr>
#> 1 Aruba ABW Air transport, pa~ IS.AIR.PSGR <NA>
#> 2 Afghanistan AFG Air transport, pa~ IS.AIR.PSGR <NA>
#> 3 Angola AGO Air transport, pa~ IS.AIR.PSGR <NA>
#> 4 Albania ALB Air transport, pa~ IS.AIR.PSGR <NA>
#> 5 Andorra AND Air transport, pa~ IS.AIR.PSGR <NA>
#> 6 Arab World ARB Air transport, pa~ IS.AIR.PSGR <NA>
#> # ... with 58 more variables: `1961` <chr>, `1962` <chr>, `1963` <chr>,
#> # `1964` <chr>, `1965` <chr>, `1966` <chr>, `1967` <chr>, `1968` <chr>,
#> # `1969` <chr>, `1970` <int>, `1971` <int>, `1972` <int>, `1973` <int>,
#> # `1974` <int>, `1975` <int>, `1976` <int>, `1977` <int>, `1978` <int>,
#> # `1979` <int>, `1980` <int>, `1981` <int>, `1982` <int>, `1983` <int>,
#> # `1984` <int>, `1985` <int>, `1986` <int>, `1987` <int>, `1988` <int>,
#> # `1989` <int>, `1990` <int>, `1991` <int>, `1992` <int>, `1993` <int>,
#> # `1994` <int>, `1995` <dbl>, `1996` <int>, `1997` <int>, `1998` <int>,
#> # `1999` <int>, `2000` <int>, `2001` <int>, `2002` <int>, `2003` <int>,
#> # `2004` <int>, `2005` <int>, `2006` <int>, `2007` <dbl>, `2008` <dbl>,
#> # `2009` <dbl>, `2010` <dbl>, `2011` <dbl>, `2012` <dbl>, `2013` <dbl>,
#> # `2014` <dbl>, `2015` <dbl>, `2016` <dbl>, `2017` <chr>, X63 <chr>
Created on 2018-03-05 by the reprex package (v0.2.0).
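For anyone who prefers to stay in base R, here is a minimal sketch of the same fix with read.csv. The file below is a made-up stand-in that only mimics the layout described above (4 lines before the header); its values are not taken from the World Bank download. Adding check.names = FALSE is my own suggestion for keeping names such as "Country Name" and "1970" unchanged.

```r
# Build a small stand-in file: 4 lines of preamble, then the header and data.
# The contents are invented for illustration; only the layout matters.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  '"Data Source","World Development Indicators"',
  '',
  '"Last Updated Date","2018-03-01"',
  '',
  '"Country Name","Country Code","1970","1971"',
  '"Aruba","ABW",100,110',
  '"Angola","AGO",200,210'
), tmp)

# Without skip, the first line is treated as the header. It has fewer
# fields than the rows below it, so the read is expected to fail.
res <- tryCatch(read.csv(tmp), error = function(e) conditionMessage(e))
print(res)

# Skipping the 4 preamble lines lets the real header row be used.
dat <- read.csv(tmp, skip = 4, check.names = FALSE, stringsAsFactors = FALSE)
str(dat)
```

On the real file the call would be read.csv("API_IS.AIR.PSGR_DS2_en_csv_v2.csv", skip = 4, check.names = FALSE), assuming the layout matches the one shown in this answer.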
Answer 1: (score: 0)
I ran into a similar problem when running through docker, so I had to download the file first and then read the CSV file.
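# Download the data first. Note: this URL is the indicator's web page;
# the CSV itself comes from the download link on that page.
download.file("https://data.worldbank.org/indicator/IS.AIR.PSGR", destfile = "file.csv")
# Then load the local copy
gm <- read.table("file.csv", header = TRUE, stringsAsFactors = FALSE, skipNul = FALSE)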
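Once the file is on disk, it can help to check where the header row actually sits before reading it, rather than hard-coding the number of lines to skip. The sketch below is only an illustration: count_preamble is a hypothetical helper, and the "Country Name" marker is an assumption about the file's header. The demo file is invented and stands in for the downloaded one.

```r
# Hypothetical helper: count the lines that come before the header row,
# which is identified by a marker string (assumed here to be "Country Name").
count_preamble <- function(path, marker = "Country Name", n = 20) {
  head_lines <- readLines(path, n = n, warn = FALSE)
  hit <- grep(marker, head_lines, fixed = TRUE)
  if (length(hit) == 0) {
    stop("header marker not found in the first ", n, " lines")
  }
  hit[1] - 1
}

# Invented stand-in for the downloaded file.
demo <- tempfile(fileext = ".csv")
writeLines(c(
  '"Data Source","World Development Indicators"',
  '',
  '"Last Updated Date","2018-03-01"',
  '',
  '"Country Name","Country Code","1970"',
  '"Aruba","ABW",100'
), demo)

n_skip <- count_preamble(demo)
gm_demo <- read.csv(demo, skip = n_skip, check.names = FALSE,
                    stringsAsFactors = FALSE)
print(gm_demo)
```

If the marker is missing, for example because the download was not a CSV at all, the helper stops with an error instead of letting read.csv fail later with a less obvious message.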