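I am using R 3.3.1 with RStudio 0.99.903 on a work PC.
I have a tab-delimited file that I am trying to read with fread. Unfortunately, some rows end with a double tab while other rows do not.
Here are the first few lines of my data: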
[1] "1054434\t01-01-2015\t-1\tAMOUNT OWN MUSIC\t12\t\t"
[2] "1054434\t01-01-2015\t-1\tDVDS\t2\t"
[3] "1054434\t01-01-2015\t-1\tINIT TV\t2\t\t"
[4] "1054434\t01-01-2015\t-1\tINIT2\t4\t\t"
[5] "1054434\t01-01-2015\t-1\tIntro_other_TV\t2\t\t"
I thought I could solve this with the fill = TRUE option, but I get this error message:
test<-fread(filenames[1], header = FALSE, fill = TRUE)
Error in fread(filenames[1], header = FALSE, fill = TRUE) :
unused argument (fill = TRUE)
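Going by the help file, I don't understand why fill isn't working, since it is definitely a valid option...
I am using data.table 1.9.6 from CRAN. When I try to install the GitHub version, I get this error message: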
* installing *source* package 'data.table' ...
** libs
*** arch - i386
Warning: running command 'make -f "Makevars" -f "D:/R- 33~1.1/etc/i386/Makeconf" -f "D:/R-33~1.1/share/make/winshlib.mk" SHLIB="data.table.dll" OBJECTS="assign.o bmerge.o chmatch.o dogroups.o fastmean.o fcast.o fmelt.o forder.o frank.o fread.o fsort.o fwrite.o gsumm.o ijoin.o init.o openmp-utils.o quickselect.o rbindlist.o reorder.o shift.o subset.o transpose.o uniqlist.o vecseq.o wrappers.o"' had status 127
ERROR: compilation failed for package 'data.table'
* removing 'D:/R-3.3.1/library/data.table'
Warning in install.packages :
running command '"D:/R-33~1.1/bin/x64/R" CMD INSTALL -l "D:\R-3.3.1\library" C:\Users\swiftc47\AppData\Local\Temp\RtmpeYBevK/downloaded_packages/data.table_1.9.7.tar.gz' had status 1
Warning in install.packages :
installation of package ‘data.table’ had non-zero exit status
Answer 0: (score: 1)
Version 1.9.6 does not have the fill option. Try updating to the current CRAN version (1.9.8+), where fill = TRUE works as expected:
fread("test.tsv", fill = TRUE)
# V1 V2 V3 V4 V5 V6 V7
# 1: 1054434 01-01-2015 -1 AMOUNT OWN MUSIC 12 NA NA
# 2: 1054434 01-01-2015 -1 DVDS 2 NA NA
# 3: 1054434 01-01-2015 -1 INIT TV 2 NA NA
# 4: 1054434 01-01-2015 -1 INIT2 4 NA NA
# 5: 1054434 01-01-2015 -1 Intro_other_TV 2 NA NA
where test.tsv is your file.
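Since the error comes down to the installed version, a quick version check can confirm whether fill is available before calling fread. This is only a sketch: supports_fill is a hypothetical helper name, and the 1.9.8 threshold is taken from the answer above rather than verified against the changelog.

```r
# Hypothetical helper: TRUE if a data.table version string is new enough
# for fread(fill = TRUE), which the answer says needs 1.9.8 or later.
supports_fill <- function(version) {
  package_version(version) >= package_version("1.9.8")
}

supports_fill("1.9.6")  # FALSE: the version used in the question
supports_fill("1.9.8")  # TRUE

# With data.table installed, the same check against the live install would be:
#   supports_fill(as.character(packageVersion("data.table")))
# If that is FALSE, install.packages("data.table") fetches the current CRAN release.
```

Installing the CRAN binary this way also avoids compiling from source, which is the step that failed in the question.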
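Alternatively, you can use a command-line tool to strip the trailing whitespace. I am not familiar with sed, so I used this question as a reference: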
fread("sed 's/[ \t]*$//' test.tsv")
# V1 V2 V3 V4 V5
# 1: 1054434 01-01-2015 -1 AMOUNT OWN MUSIC 12
# 2: 1054434 01-01-2015 -1 DVDS 2
# 3: 1054434 01-01-2015 -1 INIT TV 2
# 4: 1054434 01-01-2015 -1 INIT2 4
# 5: 1054434 01-01-2015 -1 Intro_other_TV 2
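The question mentions a Windows work PC, where sed may not be on the PATH. As a rough sketch, the same trimming can be done in R before parsing. strip_trailing_ws is a hypothetical helper name, the two sample rows are copied from the question, and base read.table is used only so the example runs without any package.

```r
# Hypothetical helper: drop trailing spaces and tabs from each line,
# mirroring the sed expression s/[ \t]*$//.
strip_trailing_ws <- function(lines) {
  sub("[ \t]+$", "", lines)
}

# Two sample rows from the question: one ends in a double tab, one in a single tab.
lines <- c(
  "1054434\t01-01-2015\t-1\tAMOUNT OWN MUSIC\t12\t\t",
  "1054434\t01-01-2015\t-1\tDVDS\t2\t"
)
cleaned <- strip_trailing_ws(lines)

# Parse the cleaned text as tab-separated data; every row now has five fields.
dat <- read.table(text = cleaned, sep = "\t", header = FALSE,
                  stringsAsFactors = FALSE)
print(dat)
```

For a real file, lines would come from readLines("test.tsv"), and the cleaned text could be passed to fread instead of read.table.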
A final option is to replace the double \t with a single one, in case you want to keep one column of NA:
fread("sed 's/[ \t][ \t]$/\t/' ~/Desktop/test.tsv")
# V1 V2 V3 V4 V5 V6
# 1: 1054434 01-01-2015 -1 AMOUNT OWN MUSIC 12 NA
# 2: 1054434 01-01-2015 -1 DVDS 2 NA
# 3: 1054434 01-01-2015 -1 INIT TV 2 NA
# 4: 1054434 01-01-2015 -1 INIT2 4 NA
# 5: 1054434 01-01-2015 -1 Intro_other_TV 2 NA
Answer 1: (score: -2)
You can do this with
read.table(filenames[1],fill=TRUE)
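One caveat with this call as written: read.table splits on any whitespace by default, so a value such as "AMOUNT OWN MUSIC" would be broken into several fields. A minimal sketch of a tab-aware variant, using two sample rows from the question in place of filenames[1], might look like this:

```r
# Two sample rows from the question, standing in for the real file.
lines <- c(
  "1054434\t01-01-2015\t-1\tAMOUNT OWN MUSIC\t12\t\t",
  "1054434\t01-01-2015\t-1\tDVDS\t2\t"
)

# sep = "\t" keeps values containing spaces in one column;
# fill = TRUE pads rows that have fewer fields than the widest row.
dat <- read.table(text = lines, sep = "\t", header = FALSE, fill = TRUE,
                  stringsAsFactors = FALSE)
print(dat)
```

With a file on disk, the text argument would be replaced by the file path, for example read.table(filenames[1], sep = "\t", fill = TRUE).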