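Special characters in a column: mess in the table

Time: 2015-10-14 11:29:34

Tags: r

I have a problem with special characters in one of the columns of my table.

Here is a sample of the data: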

structure(list(shipType = structure(c(1L, 3L, 1L, 2L, 4L), .Label = c("CARGO", 
"FISHING", "TOWING_LONG_WIDE", "UNKNOWN"), class = "factor"), 
    shipCargo = structure(c(1L, 1L, 1L, 1L, 1L), .Label = "UNDEFINED", class = "factor"), 
    destination = structure(c(3L, 1L, 2L, 4L, 5L), .Label = c("\\KORSOR  ;.,NA,.\\", 
    "LEHTMA", "RIGA", "TALLIN", "VYBORG"), class = "factor"), 
    eta = structure(c(1L, 2L, 5L, 3L, 4L), .Label = c("01/01 00:00 UTC", 
    "01/01 09:00 UTC", "24/12 16:00 UTC", "26/12 07:00 UTC", 
    "30/12 16:00 UTC"), class = "factor"), imo = structure(c(3L, 
    5L, 1L, 4L, 2L), .Label = c("7101891", "7406318", "9066045", 
    "9158185", "Russia"), class = "factor"), callsign = structure(c(5L, 
    1L, 2L, 3L, 4L), .Label = c("12", "UALB", "UBYK8", "UFPC", 
    "UICC"), class = "factor"), country = structure(c(2L, 1L, 
    2L, 2L, 2L), .Label = c("2014-12-29", "Russia"), class = "factor"), 
    month = c(12L, 1L, 12L, 12L, 12L), date = structure(c(2L, 
    1L, 2L, 2L, 2L), .Label = c("", "2014-12-29"), class = "factor"), 
    week = c(1L, NA, 1L, 1L, 1L), X = c(NA, NA, NA, NA, NA)), .Names = c("shipType", 
"shipCargo", "destination", "eta", "imo", "callsign", "country", 
"month", "date", "week", "X"), class = "data.frame", row.names = c(NA, 
-5L))

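As you can see in the second row, the "destination" column gets messed up, and the columns after it are shifted (for example, imo contains "Russia" and country contains "2014-12-29"), when the file is read with the following code: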

data <- read.table(file, header=T, fill=T, sep=",")
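One way to check which lines split into an unexpected number of fields is `count.fields()`. The sketch below uses a made-up three-column sample (the real file is not shown here) and assumes the problem comes from unquoted commas inside the destination field, which is only a guess.

```r
# Hypothetical sample: the last record has stray commas inside
# what should be a single "destination" field.
txt <- c(
  "shipType,destination,country",
  "CARGO,RIGA,Russia",
  "TOWING_LONG_WIDE,KORSOR  ;.,NA,.,Russia"
)

# Count the comma-separated fields on every line, with quoting disabled.
con <- textConnection(txt)
n_fields <- count.fields(con, sep = ",", quote = "")

# Lines whose field count differs from the header line are suspicious.
bad_lines <- which(n_fields != n_fields[1])
print(n_fields)
print(bad_lines)
```

On a real file, the file path can be passed to `count.fields()` in place of the text connection.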

I tried different things, for example reading the file with the quote argument set and without a header:

data <- read.table(file, sep=",", fill=T, head=F, quote="")

Then I removed the first row (the actual header of the table...) and added the headers back:

data <- data[-1,]
colnames(data)<-c( "shipType", "shipCargo","destination","eta","imo","callsign", "country","month","date","week")
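The same two steps can probably be folded into the read itself with the `skip` and `col.names` arguments of `read.table()`. This is a minimal sketch on a made-up three-column sample, not the real file; the column names in `cols` are shortened for illustration.

```r
# Hypothetical sample with a header line followed by two clean records.
txt <- c(
  "shipType,destination,country",
  "CARGO,RIGA,Russia",
  "FISHING,TALLIN,Russia"
)

cols <- c("shipType", "destination", "country")

# skip = 1 drops the header line while reading, and col.names assigns
# the names directly, so no row has to be deleted afterwards.
data <- read.table(text = txt, sep = ",", header = FALSE, quote = "",
                   skip = 1, col.names = cols, stringsAsFactors = FALSE)
print(data)
```

Because the header line is never read as data, numeric columns such as month or week would keep their numeric type instead of being read as text.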

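The result looks better, but there are a lot of special characters, and editing them by hand would be time-consuming and error-prone (I have a lot of tables).

Is there a way to avoid messing up the columns when importing the file?

Thanks!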
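For illustration only, one possible direction is to read the raw lines and merge any surplus fields back into the problem column before building the data frame. The helper `fix_line` below is hypothetical, and it assumes that only one column (at a known position) can contain stray commas and that every record should have the same number of fields. Neither assumption has been checked against the real files.

```r
# Hypothetical helper: merges surplus fields back into column `bad_col`.
# Assumes stray separators occur only in that one column.
fix_line <- function(line, n_cols, bad_col, sep = ",") {
  parts <- strsplit(line, sep, fixed = TRUE)[[1]]
  extra <- length(parts) - n_cols
  if (extra > 0) {
    merged <- paste(parts[bad_col + 0:extra], collapse = sep)
    before <- parts[seq_len(bad_col - 1)]
    after  <- parts[-seq_len(bad_col + extra)]
    parts  <- c(before, merged, after)
  }
  length(parts) <- n_cols  # pad short lines with NA
  parts
}

# Made-up records (header already removed); the second one has stray commas.
txt <- c(
  "CARGO,RIGA,Russia",
  "TOWING_LONG_WIDE,KORSOR  ;.,NA,.,Russia"
)

rows  <- lapply(txt, fix_line, n_cols = 3, bad_col = 2)
fixed <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
names(fixed) <- c("shipType", "destination", "country")
print(fixed)
```

With a real file, `readLines(file)` would supply the lines, and `n_cols` and `bad_col` would have to match the actual layout (destination is the third of the ten named columns in the sample above).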

0 answers:

No answers yet