Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : line 1 did not have 2 elements

Time: 2014-10-25 19:49:39

Tags: r

examdata <- RCurl::getURL("https://raw.githubusercontent.com/jrwolf/IT497/master/spendingdata.txt")

examdata2 <- read.table(textConnection(examdata), sep = ",", header = T)
  

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : line 1 did not have 2 elements

3 answers:

Answer 0 (score: 7)

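It looks like you just need to skip a few lines. I used readLines(textConnection(examdata)) to find where the actual data table starts, and it turns out to begin on line 32. So we can skip the first 31 lines with the skip argument of read.csv. I also set strip.white = TRUE because the table seems to contain some stray whitespace.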

(df <- read.csv(text = examdata, skip = 31L, strip.white = TRUE))
#                          Type Cash Check Credit Debit Electronic Other Total
# 1 Average Number of Purchases 23.7   3.9   10.1  14.4        4.4   2.3  58.7
# 2   Average Transaction Value  $21  $168    $56   $44       $216   $69   $59
# 3      Value of Payments in %   14    19     16    18         27     5   100
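
Counting lines by hand is easy to get wrong, so the header line can also be located programmatically. The following is a minimal sketch, not the method used above: `examdata_demo` is a hypothetical, shortened stand-in for the downloaded text, and it assumes the header is the first line beginning with "Type,".

```r
# Hypothetical stand-in for the downloaded text: free-form notes, then the table.
examdata_demo <- paste(
  "Spending data notes",
  "Source: survey, see documentation",
  "",
  "Type,Cash,Check",
  "Average Number of Purchases,23.7,3.9",
  "Average Transaction Value,$21,$168",
  sep = "\n"
)

# Split the text into lines, closing the connection explicitly.
con <- textConnection(examdata_demo)
raw_lines <- readLines(con)
close(con)

# Assumption: the header is the first line that starts with "Type,".
header_line <- grep("^Type,", raw_lines)[1]

# Skip every line before the header.
df_demo <- read.csv(text = examdata_demo, skip = header_line - 1L,
                    strip.white = TRUE)
df_demo
```

With the real file, the same grep() call on readLines(textConnection(examdata)) should return the line number to subtract one from, provided the header really does start with "Type,".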

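Since you probably want these values to be numeric, you need to remove the $ signs and convert the columns to numeric, so that you can use them in any calculations you do later: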

df[-1] <- lapply(df[-1], function(x) as.numeric(sub("[$]", "", x)))
df
#                          Type Cash Check Credit Debit Electronic Other Total
# 1 Average Number of Purchases 23.7   3.9   10.1  14.4        4.4   2.3  58.7
# 2   Average Transaction Value 21.0 168.0   56.0  44.0      216.0  69.0  59.0
# 3      Value of Payments in % 14.0  19.0   16.0  18.0       27.0   5.0 100.0

Now every column except the first is numeric.
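
To confirm the conversion worked, the column types can be checked directly. This is a small self-contained sketch on a hypothetical two-row data frame (`df_demo`, not the real data) that applies the same sub() and as.numeric() step and then tests each column:

```r
# Hypothetical miniature of the table, with values still stored as strings.
df_demo <- data.frame(
  Type  = c("Average Number of Purchases", "Average Transaction Value"),
  Cash  = c("23.7", "$21"),
  Check = c("3.9", "$168"),
  stringsAsFactors = FALSE
)

# Same step as above: drop a "$" if present, then convert to numeric.
df_demo[-1] <- lapply(df_demo[-1], function(x) as.numeric(sub("[$]", "", x)))

# One logical per column: TRUE where the column is now numeric.
is_num <- vapply(df_demo[-1], is.numeric, logical(1))
is_num
```

Note that sub() removes only the first "$" in each value, which is enough for data shaped like this; values with thousands separators or other symbols would need additional cleaning.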

Answer 1 (score: 0)

read.table and read.csv accept a URL as the path and handle the connection for you, so you don't really need RCurl:

read.csv("https://raw.githubusercontent.com/jrwolf/IT497/master/spendingdata.txt", 
         skip = 31)

##                          Type Cash Check Credit Debit Electronic Other Total
## 1 Average Number of Purchases 23.7   3.9   10.1  14.4        4.4   2.3  58.7
## 2   Average Transaction Value  $21  $168    $56   $44       $216   $69   $59

Also, if you use readr::read_csv, you can tell it to parse the columns as numbers, which strips the $ characters as it reads:

library(readr)

read_csv("https://raw.githubusercontent.com/jrwolf/IT497/master/spendingdata.txt", 
         skip = 31, 
         col_types = cols(Type = 'c', .default = 'n'))    # c = character, n = number

## # A tibble: 2 × 8
##                          Type  Cash Check Credit Debit Electronic Other Total
##                         <chr> <dbl> <dbl>  <dbl> <dbl>      <dbl> <dbl> <dbl>
## 1 Average Number of Purchases  23.7   3.9   10.1  14.4        4.4   2.3  58.7
## 2   Average Transaction Value  21.0 168.0   56.0  44.0      216.0  69.0  59.0

Answer 2 (score: 0)

Try:

df <- read.csv("x.csv", ..., quote = "", fill = TRUE)