Problem reading a tab-delimited text file: read.table error in scan(...), line 7 did not have 4 elements

Asked: 2014-01-28 04:44:03

Tags: r

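I'm working on a project, but I got stuck at the very first step: reading the data. The data has four variables: "tag", "book ID", "title" and "author". The fields are separated by tabs. Here is a quick look at it: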

AMERICAN HISTORY    b15857527   These United States Unger, Irwin
AMERICAN HISTORY    b10957081   Cengage Advantage Books: American Passages  Ayers, Edward L.; Gould, Lewis L.; Oshinsky, David M.; Soderlund, Jean R.
AMERICAN HISTORY    b15131495   Voices of a People's History of the United States   Zinn, Howard; Arnove, Anthony

Here is the R code I use to read it:

train1<-read.table("train1.txt",sep="\t")

Then I get this error message:

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :   line 7 did not have 4 elements

I used readLines() to check whether line 7 really lacks 4 elements, but it looks fine:

cat(readLines("train1.txt")[1:8], sep = "\n")
AMERICAN HISTORY    b15857527   These United States Unger, Irwin
AMERICAN HISTORY    b10957081   Cengage Advantage Books: American Passages  Ayers, Edward L.; Gould, Lewis L.; Oshinsky, David M.; Soderlund, Jean R.
AMERICAN HISTORY    b15131495   Voices of a People's History of the United States   Zinn, Howard; Arnove, Anthony
AMERICAN HISTORY    b15683513   American Realities  Youngs, J. William T.
AMERICAN HISTORY    b9418230    American History: A Survey, Volume 1    Brinkley, Alan
AMERICAN HISTORY    b14348885   Liberty, Equality, Power    Murrin, John M.; Johnson, Paul E.; McPherson, James M.; Gerstle, Gary; Fahs, Alice
AMERICAN HISTORY    b9372860    American History: A Survey, Volume 2    Brinkley, Alan
AMERICAN HISTORY    b9489206    Religion in America Hemeyer, Julia Corbett

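I tried to fix it by hand in the original txt file, but whatever I did, the same error showed up on another line that also looks fine. I would really appreciate any help, thanks!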

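1 Answer:

Answer 0 (score: 0)

The problem is the apostrophe in line 3 ("People's"). By default, read.table() treats both the double quote and the apostrophe as quote characters, so it interprets the text after the apostrophe as part of a single element in column 3 of line 3, continuing until a closing apostrophe. Specify explicitly what the quote character should be: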

train1 <- read.table("train1.txt", sep="\t", quote="\"")
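To check the fix before rerunning it on the full file, a small self-contained sketch along these lines may help. It is only an illustration under assumptions: the three sample rows are copied from the question, the temporary file stands in for train1.txt, and the variable names (sample_lines, path, field_counts) are made up for this example. count.fields() reports how many fields each line has under a given quote setting, which is a quick way to locate the lines that break the read.

```r
# Hypothetical stand-in for train1.txt: three tab-separated rows from the
# question, the second of which contains an apostrophe ("People's").
sample_lines <- c(
  "AMERICAN HISTORY\tb15857527\tThese United States\tUnger, Irwin",
  "AMERICAN HISTORY\tb15131495\tVoices of a People's History of the United States\tZinn, Howard; Arnove, Anthony",
  "AMERICAN HISTORY\tb15683513\tAmerican Realities\tYoungs, J. William T."
)
path <- tempfile(fileext = ".txt")
writeLines(sample_lines, path)

# Count the fields on each line when only the double quote is a quote character.
# Every line should report 4 fields; any other value points at a problem line.
field_counts <- count.fields(path, sep = "\t", quote = "\"")
print(field_counts)

# Read the file with the same quote setting as in the answer.
train1 <- read.table(path, sep = "\t", quote = "\"", stringsAsFactors = FALSE)
str(train1)
```

If the data can also contain literal double quotes inside a field, passing quote = "" turns quote handling off entirely, so that only the tab acts as a delimiter.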