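We created a MOOC course in which a logging system recorded everything (clicks, mouse movements, video views, etc.). Between 100 and 150 students enrolled in the course.
As a result of this study we obtained a log file (JSON). Using R, I prepared this data frame: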
log_data <- ndjson::stream_in("log-export-20160721_1030.json")
dplyr::glimpse(log_data)
Observations: 1,443,817
Variables: 22
$ _id.$oid <chr> "5707a89dcbbb4d92129ee44c", "5707a89...
$ data <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
$ page <chr> "http://elearning.szte.hu/mod/szte/f...
$ pid <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,...
$ time <chr> "2016.04.08. 14:48:24.691", "2016.04...
$ type <chr> "load", "mousemove", "mousemove", "m...
$ user <chr> "3", "3", "3", "3", "3", "3", "3", "...
$ data.realDistance <dbl> NA, 0.00000, 366.87055, 241.45600, N...
$ data.x <dbl> NA, 139, 176, 261, NA, 245, 1905, 21...
$ data.xDistance <dbl> NA, 0, 37, 85, NA, 16, NA, 111, NA, ...
$ data.y <dbl> NA, 29, 394, 620, NA, 761, 553, 451,...
$ data.yDistance <dbl> NA, 0, 365, 226, NA, 141, NA, 310, N...
$ data.text <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
$ data.top <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
$ data.target <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
$ data.filename <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
$ data.length <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
$ data.actualTime <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
$ data.src <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
$ data.totalTime <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
$ data.videoId <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
$ data.seekTime <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
My questions are:
How can I count the number of log entries per user?
How can I group, split, or separate the data table by user?
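For reference, here is a minimal sketch of what I am after, assuming dplyr is available (it is already used for glimpse() above). The toy data frame below is a hypothetical stand-in for the real log_data: only the column names user and type come from the output above, the values are made up. I am not sure this is the best approach for 1.4 million rows.

```r
library(dplyr)

# Hypothetical stand-in for log_data; only the column names `user` and
# `type` are taken from the glimpse() output, the values are invented.
log_data <- data.frame(
  user = c("3", "3", "3", "7", "7", "12"),
  type = c("load", "mousemove", "mousemove", "load", "click", "load"),
  stringsAsFactors = FALSE
)

# Question 1: number of log entries per user, most active user first.
logs_per_user <- log_data %>%
  count(user, sort = TRUE)

# Question 2a: keep one table, but compute summaries per user.
events_per_user <- log_data %>%
  group_by(user) %>%
  summarise(n_events = n(), n_types = n_distinct(type))

# Question 2b: split into a named list with one data frame per user.
by_user <- split(log_data, log_data$user)

print(logs_per_user)
print(names(by_user))
```

With the real data the first three lines of the toy data frame would be dropped, and by_user[["3"]] would then hold every row logged for user 3. Base R's table(log_data$user) would give the same counts as count() without dplyr.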