My data looks like this (the first line, the column ruler, is not in my txt file):
----*----1----*----2----*---
Region                 Value
New York, NY        66,834.6
Kings, NY           34,722.9
Bronx, NY           31,729.8
Queens, NY          20,453.0
San Francisco, CA   16,526.2
Hudson, NJ          12,956.9
Suffolk, MA         11,691.6
Philadelphia, PA    11,241.1
Washington, DC       9,378.0
Alexandria IC, VA    8,552.2
My attempt is:
#fwf data2
path <- "fwfdata2.txt"
data6 <- read.fwf(path,
widths=c(17, -3, 8),
header=TRUE,
#sep=""
as.is=FALSE)
data6
The result is:
> data6
Region.................Value
New York, NY 66,834.6
Kings, NY 34,722.9
Bronx, NY 31,729.8
Queens, NY 20,453.0
San Francisco, CA 16,526.2
Hudson, NJ 12,956.9
Suffolk, MA 11,691.6
Philadelphia, PA 11,241.1
Washington, DC 9,378.0
Alexandria IC, VA 8,552.2
> dim(data6)
[1] 10 1
The problem is that my data is separated by both "," and " ". When I add sep="", it produces the following error:
Error in read.table(file = FILE, header = header, sep = sep, row.names = row.names, :
more columns than column names
Answer 0: (score: 2)
I think your problem is that read.fwf expects the header to be delimited by sep, whereas the data is fixed width:
header: a logical value indicating whether the file contains the
names of the variables as its first line. If present, the
names must be delimited by ‘sep’.
sep: character; the separator used internally; should be a
character that does not occur in the file (except in the
header).
I read the data while skipping the header, then read the header separately by reading only the first line:
> data = read.fwf(path,widths=c(17,-3,8),head=FALSE,skip=1,as.is=TRUE)
> heads = read.fwf(path,widths=c(17,-3,8),head=FALSE,n=1,as.is=TRUE)
> names(data)=heads[1,]
> data
Region Value
1 New York, NY 66,834.6
2 Kings, NY 34,722.9
3 Bronx, NY 31,729.8
4 Queens, NY 20,453.0
5 San Francisco, CA 16,526.2
6 Hudson, NJ 12,956.9
7 Suffolk, MA 11,691.6
8 Philadelphia, PA 11,241.1
9 Washington, DC 9,378.0
10 Alexandria IC, VA 8,552.2
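The two-pass read above can be reproduced end to end with a temporary file. This is only a minimal sketch, assuming the layout is 17 characters of region, 3 skipped characters and 8 characters of value; the temporary file, the shortened sample rows and the trimws() call on the names are illustrative additions, not part of the original file.

```r
# Build a small fixed-width sample: 20 left-aligned characters + 8 right-aligned.
region <- c("Region", "New York, NY", "Kings, NY", "Washington, DC")
value  <- c("Value", "66,834.6", "34,722.9", "9,378.0")
path <- tempfile(fileext = ".txt")   # hypothetical stand-in for fwfdata2.txt
writeLines(sprintf("%-20s%8s", region, value), path)

widths <- c(17, -3, 8)

# Pass 1: the data rows, skipping the header line.
data <- read.fwf(path, widths = widths, header = FALSE, skip = 1, as.is = TRUE)

# Pass 2: only the header line, read with the same widths.
heads <- read.fwf(path, widths = widths, header = FALSE, n = 1, as.is = TRUE)

# Fixed-width fields keep their padding, so trim it before using them as names.
names(data) <- trimws(unlist(heads[1, ], use.names = FALSE))

str(data)
```

Without the trimws() step the column names keep the padding spaces from the header line, which makes them awkward to refer to later.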
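If you want Region as a factor, use as.is=FALSE when reading the data (as in your example), but you must use as.is=TRUE when reading the header, otherwise the header values are converted to factors and the names come out as numbers.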
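The reason is that a factor stores integer codes alongside its labels, and it is easy to end up with the codes instead of the text. A tiny illustration, with `f` as a made-up stand-in for a header cell read as a factor:

```r
f <- factor("Region")   # hypothetical header cell stored as a factor
as.character(f)         # the label: "Region"
as.integer(f)           # the underlying code: 1
```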
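Do you also want to split Region into its comma-separated parts, and convert the comma-formatted numbers into numeric values? You didn't say.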
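If so, one possible way to do both in base R is sketched below. It assumes every region has the form "name, two-letter code" and that the comma in the values is a thousands separator; the County and State column names and the two sample rows are chosen purely for illustration.

```r
# Two sample rows as read.fwf would return them (padding included).
data <- data.frame(
  Region = c("New York, NY     ", "Alexandria IC, VA"),
  Value  = c("66,834.6", " 8,552.2"),
  stringsAsFactors = FALSE
)

# Drop the thousands separator, then convert to numeric.
data$Value <- as.numeric(gsub(",", "", trimws(data$Value), fixed = TRUE))

# Split each region at its last comma into two trimmed columns.
region <- trimws(data$Region)
data$County <- sub(",\\s*[^,]*$", "", region)  # everything before the last comma
data$State  <- sub("^.*,\\s*", "", region)     # everything after the last comma

data
```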