I have a .txt dataset in which the first 12 lines are text, followed by 2 blank lines, and then the data:
DATE HEIGHT INPUT OUTPUT TESTMEASURE
01/01/1933 NO RECORD NO RECORD MISSING MISSING
01/02/1933 NO RECORD NO RECORD MISSING MISSING
But when I run
dat <- fread('data.txt')
it skips 15 lines and uses the first data row as the column names of the imported dataset. The header row is ignored:
01/01/1933 NO RECORD NO RECORD MISSING MISSING
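The skip argument has no effect on what gets imported. How can I specify the line number that should be used for the column names? Alternatively, I could rename the columns afterwards, but then the first row of data must not be dropped. Here is the verbose output: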
Input contains no \n. Taking this to be a filename to open
File opened, filesize is 0.001319 GB.
Memory mapping ... ok
Detected eol as \r\n (CRLF) in that order, the Windows standard.
Positioned on line 1 after skip or autostart
This line is the autostart and not blank so searching up for the last non-blank ... line 1
Detecting sep ... '\t'
Detected 5 columns. Longest stretch was from line 15 to line 30
Starting data input on line 15 (either column names or first row of data). First 10 characters: 01/01/1933
The line before starting line 15 is non-empty and will be ignored (it has too few or too many items to be column names or data): DATE HEIGHT INPUT OUTPUT TESTMEASURE
the fields on line 15 are character fields. Treating as the column names.
Answer 0 (score: 2)
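You have 12 lines of text, 2 blank lines, and then your data. However, I noticed that there is extra whitespace between DATE and HEIGHT. To reproduce this, I made a text file like the one below, with the data tab-separated and with 2 tabs between DATE and HEIGHT instead of 1 tab: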
garbage
garbage
garbage
garbage
garbage
garbage
garbage
garbage
garbage
garbage
garbage
garbage
DATE HEIGHT INPUT OUTPUT TESTMEASURE
01/01/1933 NO RECORD NO RECORD MISSING MISSING
01/02/1933 NO RECORD NO RECORD MISSING MISSING
Running fread(data) gives me:
fread(data)
01/01/1933 NO RECORD NO RECORD MISSING MISSING
1: 01/02/1933 NO RECORD NO RECORD MISSING MISSING
Removing the extra tab between DATE and HEIGHT gives me:
DATE HEIGHT INPUT OUTPUT TESTMEASURE
1: 01/01/1933 NO RECORD NO RECORD MISSING MISSING
2: 01/02/1933 NO RECORD NO RECORD MISSING MISSING
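
If editing the file by hand is not practical, one possible workaround is to read the header line separately, drop the empty field created by the doubled tab, and pass the cleaned names to fread. The following is only a sketch: it assumes a reasonably recent data.table in which skip, header and col.names behave as documented, and the names path, top, header_row and cols are made up for the illustration. It also assumes the header is the first line starting with DATE and sits within the first 50 lines.

```r
library(data.table)

# Recreate a small file with the layout described above (hypothetical temp path).
path <- tempfile(fileext = ".txt")
writeLines(c(
  rep("garbage", 12), "", "",
  "DATE\t\tHEIGHT\tINPUT\tOUTPUT\tTESTMEASURE",  # note the doubled tab
  "01/01/1933\tNO RECORD\tNO RECORD\tMISSING\tMISSING",
  "01/02/1933\tNO RECORD\tNO RECORD\tMISSING\tMISSING"
), path)

# Locate the header line (assumed to be within the first 50 lines).
top <- readLines(path, n = 50)
header_row <- grep("^DATE\t", top)[1]

# Split the header on tabs and drop the empty field left by the extra tab.
cols <- strsplit(top[header_row], "\t", fixed = TRUE)[[1]]
cols <- cols[nzchar(cols)]

# Skip everything up to and including the header, then read the rows as data only.
dat <- fread(path, skip = header_row, header = FALSE, sep = "\t",
             col.names = cols)
print(dat)
```

Because header = FALSE is set explicitly, the first data row is kept as data rather than being promoted to column names.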