Read a CSV, skipping three lines, but keep the header names in the data.frame

Time: 2016-01-05 21:27:47

Tags: r csv

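I want to read a CSV file, skipping three lines (not counting the header), while keeping the header names in the data.frame. I tried the following, but the header names come out wrong: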

> sine = read.csv(file="sine.csv",head=TRUE,sep=",", skip=3, check.names=TRUE)
> colnames(sine)
 [1] "X0"     "X0.0"   "X0.0.1" "X0.0.2" "None"   "X1.0"   "X0.0.3" "None.1" "X.."   
[10] "X0.1"   "X0.2"

When I read the dataset without skipping the three lines, the header names are fine:

> sine = read.csv(file="sine.csv",head=TRUE,sep=",", skip=0, check.names=TRUE)
> colnames(sine)
 [1] "reset"                                                                                    
 [2] "angle"                                                                                    
 [3] "sine"                                                                                     
 [4] "multiStepPredictions.actual"                                                              
 [5] "multiStepPredictions.1"                                                                   
 [6] "anomalyScore"                                                                             
 [7] "multiStepBestPredictions.actual"                                                          
 [8] "multiStepBestPredictions.1"                                                               
 [9] "anomalyLabel"                                                                             
[10] "multiStepBestPredictions.multiStep.errorMetric..altMAPE..steps..1..window.1000.field.sine"
[11] "multiStepBestPredictions.multiStep.errorMetric..aae..steps..1..window.1000.field.sine"    

What am I doing wrong?

1 answer:

Answer 0: (score: 3)

Something like this:

foo <- read.csv("http://www.ats.ucla.edu/stat/r/faq/test.csv", header=TRUE)
foo
#    make   model mpg weight price
# 1   amc concord  22   2930  4099
# 2   amc   oacer  17   3350  4749
# 3   amc  spirit  22   2640  3799
# 4 buick century  20   3250  4816
# 5 buick electra  15   4080  7827
colnames(foo)
# [1] "make"   "model"  "mpg"    "weight" "price" 

bar <- read.csv("http://www.ats.ucla.edu/stat/r/faq/test.csv", header=TRUE, skip=3)
bar
#     amc  spirit X22 X2640 X3799
# 1 buick century  20  3250  4816
# 2 buick electra  15  4080  7827
colnames(bar)
# [1] "amc"    "spirit" "X22"    "X2640"  "X3799" 

As Richard Scriven pointed out, my initial answer does not work; I'm not sure how I missed that. I found this SO answer and adapted it into the solution below.

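all_content = readLines("http://www.ats.ucla.edu/stat/r/faq/test.csv")
# Drop lines 2 to 4 (the three rows right after the header); line 1, the header, is kept
kept_lines = all_content[-(2:4)]
# Parse the remaining lines as CSV, using the first one as the header
foo2 = read.csv(textConnection(kept_lines), 
                header = TRUE, stringsAsFactors = FALSE)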
foo2
#    make   model mpg weight price
# 1 buick century  20   3250  4816
# 2 buick electra  15   4080  7827
colnames(foo2)
# [1] "make"   "model"  "mpg"    "weight" "price"
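
For reference, here is a minimal sketch of an alternative that does not load the whole file as text first: read the header line on its own, then read the data with `skip` and pass the saved names via `col.names`. The helper name `read_skip_after_header`, the temporary file, and the sample rows are assumptions made up for illustration (they are not from the question's `sine.csv`); only base R functions are used.

```r
# Hypothetical sample data, written to a temporary file so the example is
# self-contained and does not depend on a remote URL.
csv_lines <- c(
  "make,model,mpg,weight,price",
  "amc,concord,22,2930,4099",
  "amc,pacer,17,3350,4749",
  "amc,spirit,22,2640,3799",
  "buick,century,20,3250,4816",
  "buick,electra,15,4080,7827"
)
path <- tempfile(fileext = ".csv")
writeLines(csv_lines, path)

# read_skip_after_header() is a hypothetical helper, not part of base R.
# It keeps the header line and skips the n_skip lines that follow it.
read_skip_after_header <- function(file, n_skip) {
  # Pass 1: read only the first line, as character, to get the column names.
  header <- read.csv(file, header = FALSE, nrows = 1,
                     colClasses = "character")
  col_names <- unlist(header[1, ], use.names = FALSE)
  # Pass 2: skip the header line plus n_skip lines, then apply the names.
  read.csv(file, header = FALSE, skip = n_skip + 1,
           col.names = col_names, stringsAsFactors = FALSE)
}

result <- read_skip_after_header(path, n_skip = 3)
print(colnames(result))
print(result)
unlink(path)
```

Because `read.csv` keeps `check.names = TRUE` by default, the names passed through `col.names` should be cleaned up the same way as when the header is read directly. The file is opened twice, which is usually acceptable for local files but may be wasteful for a remote URL.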