Reading a tab-delimited file with observations that span multiple lines

Time: 2018-04-29 14:24:37

Tags: r import tab-delimited-text

Answer: specifying the quote argument in the scan() function solved the problem.

I have a dataset in a tab-delimited text file that contains name, address, and property information. I tried:

land <- as.data.frame(scan(paste0(dirdata, "land.txt"),
                           what = list(CFullName = "", CAddr1 = "", CAddr2 = "",
                                       Amount = "", CDate = "", Financer = "",
                                       Filing = "", Sched = "", Office = "",
                                       Dist = "", County = "", Municipality = ""),
                           sep = "\t", flush = TRUE, multi.line = TRUE,
                           skip = 20, na.strings = "NA"))

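In the data below, each observation should span exactly three lines. The first line is the name, the second is the street address, and the third is the city followed by the remaining variables.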

MORRIS, WILLIAM
1111 POLARIS PKWY OH1-1213
COLUMBUS, OH 43240  12.00   15-MAY-12   J. P. MORGAN CHASE & CO.    2012 July   A   N/A     N/A     N/A     N/A
JACOBS, WILLIAM
3477 COURTLAND DRIVE
LEWIS CENTER, OH 43035  12.00   30-SEP-09   J. P. MORGAN CHASE & CO.    2009 11     A   N/A     N/A     N/A     N/A
MARTIN, WILLIAM
3477 COURTLAND DRIVE
LEWIS CENTER, OH 43035  12.00   31-DEC-09   J. P. MORGAN CHASE & CO.    2010 January A  N/A     N/A     N/A     N/A
MICHAELS, WILLIAM
1111 POLARIS PKWY OH1-1213
COLUMBUS, OH 43240  12.00   15-AUG-12   J. P. MORGAN CHASE & CO.    2012    A   N/A     N/A     N/A     N/A
BROWN, WILLIAM
1111 POLARIS PKWY OH1-0213
COLUMBUS, OH 43240  12.00   31-JUL-13   J. P. MORGAN CHASE & CO. PAC    2013 32 A   N/A     N/A     N/A     N/A
BAXTER, WILLIAM
3477 COURTLAND DRIVE
LEWIS CENTER, OH 43035  12.00   15-AUG-09   J. P. MORGAN CHASE & CO. PAC    2009 11     A   N/A     N/A     N/A     N/A

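My code reads the first 2,517 observations correctly but then stops, even though the txt file should contain at least 12,000 observations. The messages my code produces are: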

EOF within quoted string
number of items read is not a multiple of the number of columns
Read 2517 records
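The "EOF within quoted string" message suggests that scan() found an opening quote character and never found its match. When sep is not a newline, scan() treats both ' and " as quote characters by default. One rough way to check whether that is what happens here is to list the raw lines that contain either character. The sketch below uses a small hypothetical sample (the name O'BRIEN is invented for illustration), not the real land.txt:

```r
# Hypothetical stand-in for land.txt; O'BRIEN is an invented name.
path <- tempfile(fileext = ".txt")
writeLines(c("MORRIS, WILLIAM",
             "1111 POLARIS PKWY OH1-1213",
             "O'BRIEN, WILLIAM",
             "3477 COURTLAND DRIVE"), path)

raw_lines <- readLines(path)

# Line numbers that contain an apostrophe or a double quote.
quote_lines <- grep("['\"]", raw_lines)

# Show the offending lines so they can be inspected by eye.
raw_lines[quote_lines]
```

On the real file, remember that the original call uses skip = 20, so the line numbers reported here count from the top of the file rather than from the first record.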

Thanks!
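For anyone hitting the same messages, here is a minimal sketch of the fix noted at the top, assuming the cause is a stray apostrophe or double quote inside a field. Passing quote = "" to scan() turns off quote handling, so those characters are read as ordinary text. The sample file, the tail_fields helper, and the name O'BRIEN are hypothetical; the column names are the ones from the question, and skip = 20 is left out because the sample has no header lines.

```r
# Hypothetical sample: two 3-line records in the layout described above.
tail_fields <- c("12.00", "15-MAY-12", "J. P. MORGAN CHASE & CO.",
                 "2012 July", "A", "N/A", "N/A", "N/A", "N/A")
sample_lines <- c(
  "O'BRIEN, WILLIAM",                      # invented name with an apostrophe
  "1111 POLARIS PKWY OH1-1213",
  paste(c("COLUMBUS, OH 43240", tail_fields), collapse = "\t"),
  "JACOBS, WILLIAM",
  "3477 COURTLAND DRIVE",
  paste(c("LEWIS CENTER, OH 43035", tail_fields), collapse = "\t")
)
path <- tempfile(fileext = ".txt")
writeLines(sample_lines, path)

fields <- list(CFullName = "", CAddr1 = "", CAddr2 = "", Amount = "",
               CDate = "", Financer = "", Filing = "", Sched = "",
               Office = "", Dist = "", County = "", Municipality = "")

# quote = "" disables quote handling, so ' and " are ordinary characters.
land <- as.data.frame(
  scan(path, what = fields, sep = "\t", quote = "", flush = TRUE,
       multi.line = TRUE, na.strings = "NA", quiet = TRUE),
  stringsAsFactors = FALSE
)

nrow(land)
land$CFullName
```

With the real file, the same call with skip = 20 added back should be the only change needed, provided unmatched quote characters really are the cause.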

0 answers:

No answers yet