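I am struggling to make sense of a silly error I ran into recently.
I am working with a data.table named log08t. When I try to view it by typing its name at the console, it throws this error: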
log08t
Error in dimnames(x) <- dn :
  length of 'dimnames' [1] not equal to array extent
In addition: Warning message:
In cbind(time = c("2017-11-08 12:38:09", "2017-11-08 12:38:09", :
  number of rows of result is not a multiple of vector length (arg 1)
When I inspect its structure with str, it looks like this:
str(log08t)
Classes ‘data.table’ and 'data.frame': 5389 obs. of 19 variables:
$ time : POSIXct, format: "2017-11-08 12:38:09" "2017-11-08 12:38:09" "2017-11-08 12:38:09" "2017-11-08 12:38:09" ...
$ type : chr "API-XML" "API-XML" "MySQL" "MySQL" ...
$ id : num 40192 40193 4131 4131 4131 ...
$ gap :Class 'difftime' atomic [1:5389] 2.59e+01 0.00 2.71e-01 2.12e-02 3.05e-04 ...
.. ..- attr(*, "units")= chr "secs"
$ bunch2 : num 24 24 24 24 24 24 24 24 24 24 ...
$ service_name: chr "GetMyTodaysSessions" "GetMyCurrentSession" "SELECT" "SELECT" ...
$ table : chr NA NA NA "class_sessions" ...
$ user_id : chr NA NA NA NA ...
$ code : chr NA NA NA NA ...
$ from : chr NA NA NA NA ...
$ to : chr NA NA NA NA ...
$ input_string: chr "Service : GetMyTodaysSessions; UserId : 499" "Service : GetMyCurrentSession; UserId : 499" NA NA ...
$ contents : chr "5299; 2017-11-08 07:57:41; 2017-11-08 08:27:41; 6; Sanjay; 499; 17; 6th grade Physics section A; 12; Room 12A; "| __truncated__ NA "select current_timestamp" "select term.class_session_id from class_sessions as term inn..." ...
$ break_cat : chr "block13" "block13" "block14" "block14" ...
$ break_serv : chr "batch1" "batch2" "batch1" "batch1" ...
$ shftime : POSIXct, format: "2017-11-08 12:37:43" "2017-11-08 12:38:09" "2017-11-08 12:38:09" "2017-11-08 12:38:09" ...
$ bunch : int 23 24 24 24 24 24 24 24 24 24 ...
$ datetext : chr "2017-11-08 12:38:09" "2017-11-08 12:38:09" "2017-11-08 12:38:09" "2017-11-08 12:38:09" ...
$ timesec :Formal class 'Period' [package "lubridate"] with 6 slots
.. ..@ .Data : num 1.51e+09 1.51e+09 1.51e+09 1.51e+09 1.51e+09 ...
.. ..@ year : num 0 0 0 0 0 0 0 0 0 0 ...
.. ..@ month : num 0 0 0 0 0 0 0 0 0 0 ...
.. ..@ day : num 0 0 0 0 0 0 0 0 0 0 ...
.. ..@ hour : num 0 0 0 0 0 0 0 0 0 0 ...
.. ..@ minute: num 0 0 0 0 0 0 0 0 0 0 ...
- attr(*, "sorted")= chr "time"
- attr(*, ".internal.selfref")=<externalptr>
I can see its dimensions:
dim(log08t)
# [1] 5389 19
I can count its rows and look at the column names:
> nrow(log08t)
# [1] 5389
> NROW(log08t$time)
# [1] 5389
> NROW(log08t$timesec)
# [1] 5389
> names(log08t)
# [1] "time" "type" "id" "gap" "bunch2" "service_name" "table" "user_id" "code" "from" "to" "input_string"
# [13] "contents" "break_cat" "break_serv" "shftime" "bunch" "datetext" "timesec"
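Any attempt to view it in full, or to view a subset of it (all columns, a few rows), raises the error.
Selecting only some of the columns works, though: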
log08t[,.(type,time)][1:10]
type time
1: API-XML 2017-11-08 12:38:09
2: API-XML 2017-11-08 12:38:09
3: MySQL 2017-11-08 12:38:09
4: MySQL 2017-11-08 12:38:09
5: MySQL 2017-11-08 12:38:09
6: MySQL 2017-11-08 12:38:09
7: MySQL 2017-11-08 12:38:09
8: MySQL 2017-11-08 12:38:09
9: MySQL 2017-11-08 12:38:09
10: MySQL 2017-11-08 12:38:09
I am fairly sure the culprit is the last column, timesec: the error started after I added that column. See here:
log08t[,.(type,time,timesec)]
# Error in dimnames(x) <- dn :
# length of 'dimnames' [1] not equal to array extent
# In addition: Warning message:
# In cbind(type = c("API-XML", "API-XML", "MySQL", "MySQL", "MySQL", :
# number of rows of result is not a multiple of vector length (arg 1)
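One thing that stands out in the str output is that timesec is the only column shown as a "Formal class" (S4) object with slots. As a minimal sketch for narrowing this down, the snippet below flags S4 columns with isS4 and flattens a Period to plain seconds with lubridate's period_to_seconds. The cols list and its values are hypothetical stand-ins for log08t, and I am only assuming, not asserting, that the S4 class is what breaks printing.

```r
library(lubridate)

# Toy columns standing in for log08t (hypothetical values, not the real data).
cols <- list(
  type    = c("API-XML", "MySQL", "MySQL"),
  id      = c(40192, 40193, 4131),
  timesec = seconds(c(10, 20, 30))  # a lubridate Period, like the timesec column
)

# TRUE marks columns that are stored as S4 objects with slots.
s4_cols <- vapply(cols, isS4, logical(1))
print(s4_cols)

# A Period can be flattened to an ordinary numeric vector of seconds.
plain_secs <- period_to_seconds(cols$timesec)
print(plain_secs)
```

Since a data.table is a list of columns, the same vapply call should work directly on log08t.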
When I delete the column, everything is fine:
> log08t[,timesec:=NULL]
> log08t
time type id gap bunch2 service_name table user_id code from to input_string
1: 2017-11-08 12:38:09 API-XML 40192 2.586546e+01 secs 24 GetMyTodaysSessions NA NA NA NA NA Service : GetMyTodaysSessions; UserId : 499
2: 2017-11-08 12:38:09 API-XML 40193 0.000000e+00 secs 24 GetMyCurrentSession NA NA NA NA NA Service : GetMyCurrentSession; UserId : 499
3: 2017-11-08 12:38:09 MySQL 4131 2.713320e-01 secs 24 SELECT NA NA NA NA NA NA
4: 2017-11-08 12:38:09 MySQL 4131 2.119088e-02 secs 24 SELECT class_sessions NA NA NA NA NA
5: 2017-11-08 12:38:09 MySQL 4131 3.051758e-04 secs 24 SELECT student_class_map NA NA NA NA NA
---
5385: 2017-11-08 13:14:25 MySQL 4355 1.583099e-03 secs 129 SELECT tbl_auth NA NA NA NA NA
5386: 2017-11-08 13:14:25 MySQL 4355 3.561974e-04 secs 129 SELECT schools NA NA NA NA NA
5387: 2017-11-08 13:14:25 MySQL 4355 4.777908e-04 secs 129 SELECT seats NA NA NA NA NA
5388: 2017-11-08 13:14:25 MySQL 4355 3.828907e-02 secs 129 SELECT student_class_map NA NA NA NA NA
5389: 2017-11-08 13:14:25 MySQL 4355 4.160404e-04 secs 129 SELECT NA NA NA NA NA NA
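I would like to know what is wrong with the last column. And even if something is wrong with it, why can't data.table replace the offending values with NA or NULL and carry on?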
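For comparison, here is a minimal sketch of a possible workaround, assuming all that is needed from timesec is a count of seconds (the .Data values around 1.51e+09 look like seconds since the epoch). It stores the value as a plain double instead of a Period. The toy table and its values are hypothetical and only mimic two columns of log08t.

```r
library(data.table)

# Toy stand-in for log08t (hypothetical values; only two columns reproduced).
toy <- data.table(
  type = c("API-XML", "MySQL", "MySQL"),
  time = as.POSIXct(
    c("2017-11-08 12:38:09", "2017-11-08 12:38:09", "2017-11-08 13:14:25"),
    tz = "UTC"
  )
)

# Keep seconds since the epoch as an ordinary numeric column, not a Period.
toy[, timesec := as.numeric(time)]

# Selecting the new column alongside the others prints like any numeric column.
print(toy[, .(type, time, timesec)])
```

A plain numeric column carries no slots, so it is subset and printed like the other atomic columns. The cost is losing the Period class and its formatting.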