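My file has no header, it is laid out in a single column, and every 21st line is longer than the rest. As a result, R does not read the remaining values on those longer lines. The only way I have found to make it work is to add a header line directly to the file, but I would like to avoid that: I have a lot of files, and it would cause trouble later because I need to merge them. So far I have tried other things, such as the strsplit() command. Here is part of my data: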
1533541940,90,123,0,656.45,13.00,50496,0.0000,-1,-1,-1,-1
1533541941,90,124,0,656.45,13.00,50496,0.0000,-1,-1,-1,-1
1533541941,90,125,0,656.45,13.00,50496,0.0000,-1,-1,-1,-1
1533541944,90,126,0,656.45,13.00,50496,0.0000,-1,-1,-1,-1,#,#,28.00,41.00,#,0,0.60,1.60,#,496,#,450,16,46560,16,173800,#,28.41,45.93,1017.19,135383.00
1533541945,90,127,0,658.06,13.00,50620,0.0000,-1,-1,-1,-1
1533541945,90,128,0,658.06,13.00,50620,0.0000,-1,-1,-1,-1
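I have very little programming experience, so could you please answer in semi-plain English, since I do not understand much programming language yet. Thank you for any help you can give.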
Answer 0 (score: 1)
This should do the trick:
res <- read.csv(text = "1533541940,90,123,0,656.45,13.00,50496,0.0000,-1,-1,-1,-1
1533541941,90,124,0,656.45,13.00,50496,0.0000,-1,-1,-1,-1
1533541941,90,125,0,656.45,13.00,50496,0.0000,-1,-1,-1,-1
1533541944,90,126,0,656.45,13.00,50496,0.0000,-1,-1,-1,-1,#,#,28.00,41.00,#,0,0.60,1.60,#,496,#,450,16,46560,16,173800,#,28.41,45.93,1017.19,135383.00
1533541945,90,127,0,658.06,13.00,50620,0.0000,-1,-1,-1,-1
1533541945,90,128,0,658.06,13.00,50620,0.0000,-1,-1,-1,-1", header = FALSE)
You can also pass a file path as the first argument to read.csv instead of the text argument.
Output:
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20 V21 V22 V23 V24 V25
1 1533541940 90 123 0 656.45 13 50496 0 -1 -1 -1 -1 NA NA NA NA NA NA NA NA
2 1533541941 90 124 0 656.45 13 50496 0 -1 -1 -1 -1 NA NA NA NA NA NA NA NA
3 1533541941 90 125 0 656.45 13 50496 0 -1 -1 -1 -1 NA NA NA NA NA NA NA NA
4 1533541944 90 126 0 656.45 13 50496 0 -1 -1 -1 -1 # # 28 41 # 0 0.6 1.6 # 496 # 450 16
5 1533541945 90 127 0 658.06 13 50620 0 -1 -1 -1 -1 NA NA NA NA NA NA NA NA
6 1533541945 90 128 0 658.06 13 50620 0 -1 -1 -1 -1 NA NA NA NA NA NA NA NA
V26 V27 V28 V29 V30 V31 V32 V33
1 NA NA NA NA NA NA NA
2 NA NA NA NA NA NA NA
3 NA NA NA NA NA NA NA
4 46560 16 173800 # 28.41 45.93 1017.19 135383
5 NA NA NA NA NA NA NA
6 NA NA NA NA NA NA NA
If you do not need the data after column V12:
res <- res[, 1:12]
Update, answering the question in the comments:
res2 <- readLines("res.csv", encoding = "utf-8")
res2 <- strsplit(res2, ",")
data.table::rbindlist(lapply(res2,
function(x) as.data.frame(matrix(x,
nrow = 1))),
fill = TRUE)
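If you would rather not depend on data.table, the same idea (split each line on commas, then pad the short rows) can be sketched in base R. This is only a rough equivalent: the `lines` vector below is made-up sample input standing in for the result of readLines(), and every column comes back as text, so you would convert the ones you need with as.numeric().

```r
# Hypothetical sample input; in practice use: lines <- readLines("res.csv")
lines <- c("1533541940,90,123,0",
           "1533541944,90,126,0,#,28.00,41.00",
           "1533541945,90,127,0")

fields <- strsplit(lines, ",")   # one character vector per line
n_max  <- max(lengths(fields))   # length of the longest row

# Pad every shorter row with NA so that all rows have n_max entries
padded <- lapply(fields, function(x) c(x, rep(NA, n_max - length(x))))

# Stack the rows into a matrix, then convert to a data frame (columns V1, V2, ...)
res_base <- as.data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)
```

The rows stay in their original order, and the short rows simply have NA in the extra columns.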
Answer 1 (score: 0)
You can use readLines() together with read.csv() to do this automatically:
read.csv(text= readLines("yourfile.csv", encoding = "utf-8"), header = F)
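Edit: as you noted in your comment, read.csv works out the number of columns from the first few lines of the input (the first five, according to the R documentation), so a longer line that appears later gets wrapped or cut. To make sure you get all the columns (if you do not care about the order of the rows), you can run: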
library(stringr)  # provides str_count()
a <- readLines("yourfile.csv", encoding = "utf-8")  # read every line of the file as text
b <- order(str_count(a, ","), decreasing = TRUE)  # put the line with the most commas first, so read.csv sees the maximum number of columns
read.csv(text = a[b], header = FALSE)
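If the row order does matter, one possible alternative is to count the fields on every line first and then tell read.csv how many columns to expect through col.names. The sketch below writes a small made-up file to a temporary path so it can be run as is; `path`, `short_row` and `long_row` are illustrative names only, and you would point `path` at your own file instead.

```r
# Hypothetical demo file: the long row sits after the first five lines
path <- tempfile(fileext = ".csv")
short_row <- "1533541940,90,123,0,656.45,13.00,50496,0.0000,-1,-1,-1,-1"
long_row  <- paste0(short_row, ",#,#,28.00,41.00")
writeLines(c(rep(short_row, 6), long_row, short_row), path)

# Count the fields on every line; comment.char = "" is needed because
# count.fields() would otherwise treat everything after '#' as a comment
n_cols <- max(count.fields(path, sep = ",", comment.char = ""))

# Supplying n_cols column names makes read.csv allocate that many columns
res3 <- read.csv(path, header = FALSE, fill = TRUE,
                 col.names = paste0("V", seq_len(n_cols)))
```

Here the rows stay in file order, and the short rows are simply padded out in the extra columns.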