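I am using R 3.1.3 (32-bit) on Windows and have a comma-separated CSV file with 8 columns and 1001 rows including the header (the full data set has 24,000+ rows).
My goal is to pull out all rows whose "Site" name contains at least one of the words "HOSPITAL", "ROYAL" or "TRUST".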
> datac <- read.csv("data1c.csv", header = TRUE, colClasses = c("character", "character", "character", "character", "character", "character", "character", "character")))
Error: unexpected ')' in "datac <- read.csv("data1c.csv", header = TRUE, colClasses = c("character", "character", "character", "character", "character", "character", "character", "character")))"
and
> read.csv("data1c.csv", header = TRUE, col.names = "ODS","Site","NGrouping", "Address1", "Address2", "Address3", "Address4", "Postcode")
Error in match.arg(numerals) : 'arg' should be one of “allow.loss”, “warn.loss”, “no.loss”
and
> subset("data1c.csv", Site=="HOSPITAL")
Error in subset.default("data1c.csv", Site == "HOSPITAL") : object 'Site' not found
and
> x <- matrix(rnorm(8008, 1), ncol = 8)
> y <- c(1, seq(8))
> x <- cbind(x, y)
Warning message:
In cbind(x, y) :
number of rows of result is not a multiple of vector length (arg 2)
I am very new to this, so any help would be much appreciated.
Answer 0 (score: 0)
For your first error, you have an extra ) at the end (three instead of two).
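A minimal sketch of the corrected first call, assuming every column should be read as text. It relies on read.csv recycling a single colClasses value across all columns, so the eight repeated "character" entries are not needed. The temporary file and its two sample rows are invented here purely so the snippet runs on its own; in practice, pass "data1c.csv" instead of tmp.

```r
# Hypothetical stand-in for data1c.csv (the sample rows are invented)
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "ODS,Site,NGrouping,Address1,Address2,Address3,Address4,Postcode",
  "A01,ROYAL INFIRMARY,G1,1 High St,,,Town,AB1 2CD",
  "B02,CITY CLINIC,G2,2 Low St,,,Town,EF3 4GH"
), tmp)

# Two closing parentheses, not three; one "character" is recycled to all columns
datac <- read.csv(tmp, header = TRUE, colClasses = "character")
str(datac)
```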
For the second, you forgot to put the list of column names in a vector, so read.csv treats them as extra arguments. Do:
read.csv("data1c.csv", header = TRUE, col.names = c("ODS", "Site", "NGrouping", "Address1", "Address2", "Address3", "Address4", "Postcode"))
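As a rough illustration of passing the names as a single vector, the sketch below reads a small made-up file. The temporary file, its placeholder header and its one data row are assumptions for the demo. With header = TRUE the first line is still consumed as a header, and col.names then overrides those names.

```r
# Hypothetical stand-in for data1c.csv with a placeholder header line
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "a,b,c,d,e,f,g,h",
  "A01,ROYAL INFIRMARY,G1,1 High St,,,Town,AB1 2CD"
), tmp)

# All eight names go into one character vector built with c()
cols <- c("ODS", "Site", "NGrouping", "Address1",
          "Address2", "Address3", "Address4", "Postcode")
datac <- read.csv(tmp, header = TRUE, col.names = cols)
names(datac)
```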
For the third, the first argument to subset must be a data.frame:
subset(datac, Site%in%c("HOSPITAL", "ROYAL", "TRUST"))
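Note that %in% only keeps rows where Site is exactly equal to one of the three words. If the goal is rows whose Site contains any of them, as the question describes, a pattern match with grepl is one option. A minimal sketch, using a small invented data frame in place of datac:

```r
# Invented sample data standing in for datac
datac <- data.frame(
  ODS  = c("A01", "B02", "C03", "D04"),
  Site = c("ROYAL INFIRMARY", "CITY CLINIC", "ST MARY'S HOSPITAL", "NHS TRUST HQ"),
  stringsAsFactors = FALSE
)

# "|" means "or": keep rows whose Site contains any of the three words
hits <- subset(datac, grepl("HOSPITAL|ROYAL|TRUST", Site))
print(hits)
```

If the site names are not consistently upper case, adding ignore.case = TRUE to the grepl call may help.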
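For the warning, x has 1001 rows and 8 columns, whereas y is a vector of length 9 (1 + length(seq(8))), which is exactly what the warning is telling you. So you must either remove an item from y or add to x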