examdata <- RCurl::getURL("https://raw.githubusercontent.com/jrwolf/IT497/master/spendingdata.txt")
examdata2 <- read.table(textConnection(examdata), sep = ",", header = T)
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : line 1 did not have 2 elements
Answer 0 (score: 7)
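It looks like you just need to skip a few lines. I used readLines(textConnection(examdata)) to find where the actual data table begins. It turns out the table starts on line 32, so we can use the skip argument of read.csv to skip the first 31 lines. I also used the strip.white argument because the table appears to contain some stray whitespace.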
(df <- read.csv(text = examdata, skip = 31L, strip.white = TRUE))
# Type Cash Check Credit Debit Electronic Other Total
# 1 Average Number of Purchases 23.7 3.9 10.1 14.4 4.4 2.3 58.7
# 2 Average Transaction Value $21 $168 $56 $44 $216 $69 $59
# 3 Value of Payments in % 14 19 16 18 27 5 100
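If counting lines by eye feels fragile, the header row can also be located programmatically. This is a minimal sketch, assuming the table's header line starts with `Type,` (as the output above suggests); `raw_text` is a small made-up stand-in for the downloaded file, not its real contents.

```r
# Hypothetical stand-in for the downloaded text: two preamble lines, then a CSV table.
raw_text <- paste(
  "Spending report (made-up preamble)",
  "Notes about the data",
  "Type, Cash, Check",
  "Average Number of Purchases, 23.7, 3.9",
  "Average Transaction Value, $21, $168",
  sep = "\n"
)

# Read the text line by line and find the first line that looks like the header.
con <- textConnection(raw_text)
lines <- readLines(con)
close(con)
header_line <- grep("^Type,", lines)[1]

# Skip everything before the header; strip.white trims the padding around fields.
df <- read.csv(text = raw_text, skip = header_line - 1L, strip.white = TRUE)
```

With the real file, the same idea would replace the hard-coded `skip = 31L`, provided the header really is the first line beginning with `Type,`.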
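Since you probably want these values to be numeric, you'll need to remove the $ signs and convert the columns to numeric so that you can use them in any later calculations.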
df[-1] <- lapply(df[-1], function(x) as.numeric(sub("[$]", "", x)))
df
# Type Cash Check Credit Debit Electronic Other Total
# 1 Average Number of Purchases 23.7 3.9 10.1 14.4 4.4 2.3 58.7
# 2 Average Transaction Value 21.0 168.0 56.0 44.0 216.0 69.0 59.0
# 3 Value of Payments in % 14.0 19.0 16.0 18.0 27.0 5.0 100.0
Now every column except the first is numeric.
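The conversion step can be wrapped in a small helper if it is needed in more than one place. This is only a sketch: `to_number` is a hypothetical name, and stripping commas as well as `$` is an assumption about other data (the table here has no thousands separators).

```r
# Hypothetical helper: drop "$" and "," and then convert to numeric.
to_number <- function(x) as.numeric(gsub("[$,]", "", x))

# Made-up sample values; "$1,168" is not from the original table.
vals <- to_number(c("$21", "$1,168", "23.7"))
```

It could then be applied to the data frame the same way, e.g. `df[-1] <- lapply(df[-1], to_number)`.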
Answer 1 (score: 0)
read.table and read.csv accept a URL as the path and handle the connection for you, so you don't really need RCurl:
read.csv("https://raw.githubusercontent.com/jrwolf/IT497/master/spendingdata.txt",
skip = 31)
## Type Cash Check Credit Debit Electronic Other Total
## 1 Average Number of Purchases 23.7 3.9 10.1 14.4 4.4 2.3 58.7
## 2 Average Transaction Value $21 $168 $56 $44 $216 $69 $59
Also, if you use readr::read_csv, you can tell it to parse the columns as numbers, which strips the $ characters as the data is read:
library(readr)
read_csv("https://raw.githubusercontent.com/jrwolf/IT497/master/spendingdata.txt",
skip = 31,
col_types = cols(Type = 'c', .default = 'n')) # c = character, n = number
## # A tibble: 2 × 8
## Type Cash Check Credit Debit Electronic Other Total
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Average Number of Purchases 23.7 3.9 10.1 14.4 4.4 2.3 58.7
## 2 Average Transaction Value 21.0 168.0 56.0 44.0 216.0 69.0 59.0
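To see what number parsing does to individual values, readr's `parse_number()` can be called directly on a character vector. This is a small sketch on values copied from the table above; as far as I know the `'n'` shortcut corresponds to `col_number()`, which applies the same kind of parsing, but treat that link as an assumption.

```r
library(readr)

# parse_number() drops non-numeric characters around the first number it finds.
parsed <- parse_number(c("$21", "$168", "23.7"))
```

This can be handy for checking how a tricky value will be read before committing to `col_types`.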
Answer 2 (score: 0)
Try:
df <- read.csv("x.csv", ..., quote = "", fill = TRUE)