我在表格的列中遇到特殊字符问题。
以下是数据示例:
structure(list(shipType = structure(c(1L, 3L, 1L, 2L, 4L), .Label = c("CARGO",
"FISHING", "TOWING_LONG_WIDE", "UNKNOWN"), class = "factor"),
shipCargo = structure(c(1L, 1L, 1L, 1L, 1L), .Label = "UNDEFINED", class = "factor"),
destination = structure(c(3L, 1L, 2L, 4L, 5L), .Label = c("\\KORSOR ;.,NA,.\\",
"LEHTMA", "RIGA", "TALLIN", "VYBORG"), class = "factor"),
eta = structure(c(1L, 2L, 5L, 3L, 4L), .Label = c("01/01 00:00 UTC",
"01/01 09:00 UTC", "24/12 16:00 UTC", "26/12 07:00 UTC",
"30/12 16:00 UTC"), class = "factor"), imo = structure(c(3L,
5L, 1L, 4L, 2L), .Label = c("7101891", "7406318", "9066045",
"9158185", "Russia"), class = "factor"), callsign = structure(c(5L,
1L, 2L, 3L, 4L), .Label = c("12", "UALB", "UBYK8", "UFPC",
"UICC"), class = "factor"), country = structure(c(2L, 1L,
2L, 2L, 2L), .Label = c("2014-12-29", "Russia"), class = "factor"),
month = c(12L, 1L, 12L, 12L, 12L), date = structure(c(2L,
1L, 2L, 2L, 2L), .Label = c("", "2014-12-29"), class = "factor"),
week = c(1L, NA, 1L, 1L, 1L), X = c(NA, NA, NA, NA, NA)), .Names = c("shipType",
"shipCargo", "destination", "eta", "imo", "callsign", "country",
"month", "date", "week", "X"), class = "data.frame", row.names = c(NA,
-5L))
正如您在第二行中看到的那样,列#34;目的地"使用以下代码读取文件时
data <- read.table(file, header=T, fill=T, sep=",")
我尝试了不同的东西,例如:使用引号导出而不使用标题
data <- read.table(file, sep=",", fill=T, head=F, quote="")
然后删除第一行(表中的实际标题...)并再添加一次这些标题
data <- data[-1,]
colnames(data)<-c( "shipType", "shipCargo","destination","eta","imo","callsign", "country","month","date","week")
它看起来更好,但是有很多特殊字符,这将是耗时/错误的来源(我有很多表..)来编辑。
有没有办法避免在导入文件时搞乱列?
谢谢!