R: invalid 'sep' value: must be one byte

Time: 2015-04-20 06:49:17

Tags: r

I am trying to read a file that uses :: as the column separator:

userID::MovieID::Rating::Timestamp
1::1193::5::978300760
1::661::3::978302109
1::914::3::978301968
1::3408::4::978300275

Here is my code:

tr <- read.table("/home/user/ml-1m/ratings.dat", sep = ":")
print(tr)

The result is:

   V1 V2   V3 V4 V5 V6        V7
1   2 NA  318 NA  5 NA 978298413
2   2 NA 1207 NA  4 NA 978298478
3   2 NA 1968 NA  2 NA 978298881
4   2 NA 3678 NA  3 NA 978299250
5   2 NA 1244 NA  3 NA 978299143
6   2 NA  356 NA  5 NA 978299686
7   2 NA 1245 NA  2 NA 978299200

I don't want the NA columns, but if I set sep = "::" I get the error: invalid 'sep' value: must be one byte. How can I solve this?

1 Answer:

Answer 0: (score: 8)

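The text file import functions only support a single one-byte character as the column separator. With sep = ":", each "::" therefore yields an empty column between the real ones. However, you can tell read.table to skip columns on import with its colClasses argument (see the help file). An unnamed colClasses vector is recycled across the columns, so c(NA, "NULL") keeps every odd-numbered column and drops every even-numbered (empty) one: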

read.table(text = "userID::MovieID::Rating::Timestamp
1::1193::5::978300760
1::661::3::978302109
1::914::3::978301968
1::3408::4::978300275", 
           sep = ":", colClasses = c(NA, "NULL"),
           header = TRUE)

#  userID MovieID Rating Timestamp
#1      1    1193      5 978300760
#2      1     661      3 978302109
#3      1     914      3 978301968
#4      1    3408      4 978300275
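
As a possible alternative to dropping the empty columns, the two-character separator could be rewritten to a single-byte one before parsing. This is only a minimal sketch, not part of the answer above: it assumes the data itself contains no tab characters, and the variable names lines, cleaned and ratings are illustrative. With a real file, lines would come from readLines() on the file path instead of the inline sample.

```r
# Sample lines in the same "::"-separated layout as the question's file.
# (Assumption: with a real file these would come from readLines(path).)
lines <- c("userID::MovieID::Rating::Timestamp",
           "1::1193::5::978300760",
           "1::661::3::978302109",
           "1::914::3::978301968",
           "1::3408::4::978300275")

# Replace every literal "::" with a tab, which is a valid one-byte separator.
# fixed = TRUE treats "::" as plain text rather than a regular expression.
# Assumption: no field contains a tab character.
cleaned <- gsub("::", "\t", lines, fixed = TRUE)

# Parse the cleaned lines; no empty in-between columns are created.
ratings <- read.table(text = cleaned, sep = "\t", header = TRUE)
print(ratings)
```

Unlike the colClasses approach, this reads the whole file into memory as text first, so for very large files the colClasses solution may be the more economical choice.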