How to use "cols()" and "col_double" with a comma as the decimal mark

Time: 2017-04-05 15:03:24

Tags: r csv readr

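I would like to parse my columns to the right types with the readr package while reading the data.

The difficulty: the fields are separated by semicolons (;), while the comma (,) is used as the decimal mark.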

library(readr)

# Test data:
T <- "Date;Time;Var1;Var2
      01.01.2011;11:11;2,4;5,6
      02.01.2011;12:11;2,5;5,5
      03.01.2011;13:11;2,6;5,4
      04:01.2011;14:11;2,7;5,3"

read_delim(T, ";")
# A tibble: 4 × 4
#              Date     Time  Var1  Var2
#             <chr>   <time> <dbl> <dbl>
# 1       01.01.2011 11:11:00    24    56
# 2       02.01.2011 12:11:00    25    55
# 3       03.01.2011 13:11:00    26    54
# 4       04:01.2011 14:11:00    27    53

So I thought the parsing would work like this, but I always get an error message:

read_delim(T, ";", cols(Date = col_date(format = "%d.%m.%Y")))
# Error: expecting a string

The same happens here:

read_delim(T, ";", cols(Var1 = col_double()))
# Error: expecting a string

I think I am doing something fundamentally wrong. ;)

Also, I would appreciate a tip on how to tell read_delim to treat the comma as the decimal mark. read.delim handles this easily with dec = ",", but I would really like to use the readr package from the start without struggling. An earlier version had a col_euro_double function, but it has been removed. What alternatives are there now?

1 answer:

Answer 0: (score: 3)

Specify the locale= argument when calling read_delim():

read_delim(T, ";", locale=locale(decimal_mark = ","))
#               Date       Time  Var1  Var2
#              <chr>     <time> <dbl> <dbl>
# 1       01.01.2011 40260 secs   2.4   5.6
# 2       02.01.2011 43860 secs   2.5   5.5
# 3       03.01.2011 47460 secs   2.6   5.4
# 4       04:01.2011 51060 secs   2.7   5.3
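
The "expecting a string" errors in the question most likely come from passing cols(...) as the third positional argument, which read_delim() takes as quote rather than as the column specification. A minimal sketch, assuming a recent readr: name the argument col_types = and combine it with the locale. The string txt is a hypothetical, un-indented subset of the question's data, not the original input. read_csv2() is shown as one possible shortcut, since it assumes ";" as the delimiter and "," as the decimal mark.

```r
library(readr)

# Hypothetical sample data: same layout as in the question, without indentation
txt <- "Date;Time;Var1;Var2
01.01.2011;11:11;2,4;5,6
02.01.2011;12:11;2,5;5,5"

# Name col_types explicitly so cols() is not matched to the `quote` argument;
# the locale tells readr that "," is the decimal mark
df <- read_delim(
  txt,
  delim = ";",
  col_types = cols(
    Date = col_date(format = "%d.%m.%Y"),
    Var1 = col_double(),
    Var2 = col_double()
  ),
  locale = locale(decimal_mark = ",")
)

# Possible shortcut: read_csv2() defaults to ";" as delimiter and "," as decimal mark
df2 <- read_csv2(txt, col_types = cols(Date = col_date(format = "%d.%m.%Y")))
```

Columns not listed in cols() (here Time) are still guessed. A malformed date such as the "04:01.2011" in the question's last row would not match "%d.%m.%Y" and should be expected to come back as NA with a parsing warning.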