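I'm trying to figure out how to split a CSV file into smaller chunks. I'd like to split it by an arbitrary number of rows: maybe 20, maybe 1,000, or anything else.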
setwd("C:/Users/my_path/test_folder/")
mydata = read.csv("NHLData.csv")
split(mydata, ceiling(seq_along(mydata)/20))
Error: Warning message: In split.default(x = seq_len(nrow(x)), f = f, drop = drop, ...) : data length is not a multiple of split variable
I also tried this.
split(mydata, ceiling(seq_along(mydata)/(length(mydata)/20)))
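Same error: Warning message: In split.default(x = seq_len(nrow(x)), f = f, drop = drop, ...) : data length is not a multiple of split variable
I Googled for ideas but didn't find anything useful. This must be really simple, right?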
Answer 0 (score: 0)
Use the 'sample' function; it should help.
setwd("C:/Users/my_path/test_folder/")
mydata = read.csv("NHLData.csv")
# If you want 5 chunks with the same number of rows, let's say 30 each
# (this assumes mydata has exactly 150 rows; rows are assigned to chunks at random):
Chunks = split(mydata, sample(rep(1:5, 30))) ## 5 chunks of 30 rows each
# If you want a block of 20 rows, index any range of 20 row numbers.
# (Do not wrap a data frame in sample(): that shuffles its columns, not its rows.)
First_chunk <- mydata[1:20, ] ## the first 20 rows
# Or take any other range of row numbers:
Second_chunk <- mydata[100:71, ] ## the last 30 rows in reverse order, if your data has 100 rows
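# If you want to write these chunks out to CSV files
# (write.csv always writes the header, so col.names is not needed and would only trigger a warning):
write.csv(First_chunk, file = "First_chunk.csv", quote = FALSE, row.names = FALSE)
write.csv(Second_chunk, file = "Second_chunk.csv", quote = FALSE, row.names = FALSE)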
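If you would rather have consecutive chunks of a fixed size than random ones, something like the following may be closer to what the question asks for. It is only a sketch: the helper name split_rows, the chunk_size argument, and the "chunk_%03d.csv" file names are made up for illustration, and the built-in mtcars data set stands in for NHLData.csv. The key point is that seq_len(nrow(df)) counts rows, whereas seq_along(df) on a data frame counts columns, which is the likely cause of the warning in the question.

```r
# Sketch: split a data frame into consecutive chunks of `chunk_size` rows.
# `split_rows` is a hypothetical helper, not part of base R.
split_rows <- function(df, chunk_size) {
  # One group id per row: 1,1,...,2,2,... (the last group may be shorter)
  groups <- ceiling(seq_len(nrow(df)) / chunk_size)
  split(df, groups)
}

# mtcars (32 rows) is used here in place of the real data
chunks <- split_rows(mtcars, 10)

# Write each chunk to its own file; tempdir() and the file names are placeholders
out_dir <- tempdir()
for (i in seq_along(chunks)) {
  out_file <- file.path(out_dir, sprintf("chunk_%03d.csv", i))
  write.csv(chunks[[i]], file = out_file, row.names = FALSE)
}
```

With your own data you would call split_rows(mydata, 20) or split_rows(mydata, 1000) and point out_dir at the folder you want the files in.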
Hope this helps.