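I'm working on a project, but I've been stuck right at the start, at reading in the data. The data has four variables: "tag", "book ID", "title", and "author". They are separated by tabs. Here is a quick look at it: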
AMERICAN HISTORY b15857527 These United States Unger, Irwin
AMERICAN HISTORY b10957081 Cengage Advantage Books: American Passages Ayers, Edward L.; Gould, Lewis L.; Oshinsky, David M.; Soderlund, Jean R.
AMERICAN HISTORY b15131495 Voices of a People's History of the United States Zinn, Howard; Arnove, Anthony
Here is my R code to read it:
train1<-read.table("train1.txt",sep="\t")
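Then I got this error message:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : line 7 did not have 4 elements
I used readLines() to check whether line 7 really lacks 4 elements, but it looks fine: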
cat(readLines("train1.txt")[1:8], sep = "\n")
AMERICAN HISTORY b15857527 These United States Unger, Irwin
AMERICAN HISTORY b10957081 Cengage Advantage Books: American Passages Ayers, Edward L.; Gould, Lewis L.; Oshinsky, David M.; Soderlund, Jean R.
AMERICAN HISTORY b15131495 Voices of a People's History of the United States Zinn, Howard; Arnove, Anthony
AMERICAN HISTORY b15683513 American Realities Youngs, J. William T.
AMERICAN HISTORY b9418230 American History: A Survey, Volume 1 Brinkley, Alan
AMERICAN HISTORY b14348885 Liberty, Equality, Power Murrin, John M.; Johnson, Paul E.; McPherson, James M.; Gerstle, Gary; Fahs, Alice
AMERICAN HISTORY b9372860 American History: A Survey, Volume 2 Brinkley, Alan
AMERICAN HISTORY b9489206 Religion in America Hemeyer, Julia Corbett
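I tried adjusting it manually in the original txt file, but whatever I did, the same error always came up on another line that also looked fine. I would really appreciate your help, thank you!!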
Answer 0 (score: 0)
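The problem is the apostrophe in line 3 (in "People's"). By default, read.table() treats both " and ' as quote characters, so it reads the text after the apostrophe as part of a single element in column 3 of line 3, up to the next apostrophe. Tell it explicitly which characters should count as quotes: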
train1 <- read.table("train1.txt", sep="\t", quote="\"")
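To check this kind of problem before reading the whole file, count.fields() can report how many fields each line has under a given quote setting. The following is a minimal sketch, not the asker's actual file: the three sample rows, the temporary file path, and the column names (tag, book_id, title, author) are assumptions made for illustration.

```r
# Hypothetical sample rows modelled on the question's data (tab-separated).
rows <- c(
  "AMERICAN HISTORY\tb15857527\tThese United States\tUnger, Irwin",
  "AMERICAN HISTORY\tb15131495\tVoices of a People's History of the United States\tZinn, Howard; Arnove, Anthony",
  "AMERICAN HISTORY\tb15683513\tAmerican Realities\tYoungs, J. William T."
)

# Write them to a temporary file so the example is self-contained.
path <- tempfile(fileext = ".txt")
writeLines(rows, path)

# Count the fields per line when only the double quote is a quote character.
# The apostrophe is then ordinary text, so every line should have 4 fields.
n_fields <- count.fields(path, sep = "\t", quote = "\"")

# Read the file with the same quote setting as in the answer.
# The column names are assumed here; they are not part of the original file.
train1 <- read.table(
  path,
  sep = "\t",
  quote = "\"",
  stringsAsFactors = FALSE,
  col.names = c("tag", "book_id", "title", "author")
)

print(n_fields)
print(train1$title)
```

If the file contains no quoted fields at all, quote = "" is another option, since it turns off quote handling entirely.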