I have an ndjson file like this:
{"start_time_last":"2019-02-24T00:07:25.875Z","start_time_first":"2019-02-24T00:07:25.875Z","device_id":"8160a3f87a977379f12f8826fd3c9c86ca3ca48a"}
{"start_time_last":"2019-02-24T00:48:56.100Z","start_time_first":"2019-02-24T00:40:24.464Z","device_id":"181606aabbf155217f59e302541638bfc7e07837"}
{"start_time_last":"2019-02-23T21:57:36.024Z","start_time_first":"2019-02-23T21:56:06.741Z","device_id":"1b62573cdfdab3902b72ec9e4797c422271f2efd"}
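As you can see, each record gives the active period of one device. My question is how to generate an ndjson with two fields. One is "timestamp", which steps through 2019-02-23T00:00, 2019-02-23T00:01, ..., 2019-02-24T23:59 (timestamps at one-minute resolution). The other is the number of distinct device_id values that are active at each timestamp.
For example, the first record starts at 2019-02-24T00:07:25.875Z and ends at 2019-02-24T00:07:25.875Z. This device ID should be counted under the timestamp: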
2019-02-24T00:07
and only in that one minute. The second record should be counted under these timestamps:
2019-02-24T00:40,
2019-02-24T00:41,
2019-02-24T00:42,
2019-02-24T00:43,
2019-02-24T00:44,
2019-02-24T00:45,
2019-02-24T00:46,
2019-02-24T00:47,
2019-02-24T00:48
How can I do this with jq? Or with any command in Bash?
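Answer 0: (score: 2)
As I understand it, you want to count the number of active devices per minute. Here is a solution that assumes the input specifies non-overlapping intervals for each "device_id":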
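def seconds:
  # strips the fractional seconds (e.g. ".875Z"), then parses to seconds since the epoch
  "\(.[:-5])Z" | fromdateiso8601;
def record($s; $e):
  # $s and $e are minutes since the epoch (possibly fractional);
  # add 1 to the counter of every whole minute from floor($s) to floor($e)
  reduce range($s | floor; ($e | floor) + 1) as $i (.; .[$i * 60 | todate] += 1);
reduce inputs as $in ({};
  record( ($in | .start_time_first | seconds / 60);
          ($in | .start_time_last | seconds / 60) ))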
A suitable invocation would look like this:
jq -n -f program.jq input.ndjson
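The jq program above relies on each "device_id" having non-overlapping intervals. As a cross-check for inputs where that assumption may not hold, here is a minimal Python sketch that keeps a set of device IDs per minute, so a device is counted at most once in any minute. The function names `parse_minute` and `count_active` are made up for this illustration, and the sketch assumes every timestamp has the same shape as in the sample (fractional seconds followed by `Z`).

```python
import json
from collections import defaultdict
from datetime import datetime, timedelta


def parse_minute(ts):
    """Parse a timestamp like 2019-02-24T00:07:25.875Z and truncate it to the minute."""
    parsed = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(second=0, microsecond=0)


def count_active(lines):
    """Return {minute: number of distinct device_ids active in that minute}."""
    devices = defaultdict(set)  # minute string -> set of device_ids seen in it
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines in the ndjson input
        rec = json.loads(line)
        minute = parse_minute(rec["start_time_first"])
        last = parse_minute(rec["start_time_last"])
        # Walk every whole minute from the first to the last, inclusive.
        while minute <= last:
            devices[minute.strftime("%Y-%m-%dT%H:%M")].add(rec["device_id"])
            minute += timedelta(minutes=1)
    return {m: len(ids) for m, ids in sorted(devices.items())}
```

Calling `count_active` on the lines of `input.ndjson` and writing each entry with `json.dumps({"timestamp": m, "count": n})` would give one two-field object per active minute.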
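Answer 1: (score: 1)
As I understand it, you are taking these objects as input and generating a series of objects with the device_id for every minute timestamp within the date range given by start_time_*.
If your operating system and jq build support the date functions, you can use them to generate the JSON.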
def fromdateiso8601wd:
  # strips fractional seconds, then truncates to the start of the minute
  "\(.[:-5])Z" | fromdateiso8601 | . - (. % 60);
[
  (.start_time_first | fromdateiso8601wd),
  (.start_time_last | fromdateiso8601wd + 60)
] as [$s, $e] |
{
  # one object per minute from the first minute to the last, inclusive
  timestamp: (range($s; $e; 60) | todateiso8601),
  device_id
}
This produces:
{
  "timestamp": "2019-02-24T00:07:00Z",
  "device_id": "8160a3f87a977379f12f8826fd3c9c86ca3ca48a"
}
{
  "timestamp": "2019-02-24T00:40:00Z",
  "device_id": "181606aabbf155217f59e302541638bfc7e07837"
}
{
  "timestamp": "2019-02-24T00:41:00Z",
  "device_id": "181606aabbf155217f59e302541638bfc7e07837"
}
{
  "timestamp": "2019-02-24T00:42:00Z",
  "device_id": "181606aabbf155217f59e302541638bfc7e07837"
}
{
  "timestamp": "2019-02-24T00:43:00Z",
  "device_id": "181606aabbf155217f59e302541638bfc7e07837"
}
{
  "timestamp": "2019-02-24T00:44:00Z",
  "device_id": "181606aabbf155217f59e302541638bfc7e07837"
}
{
  "timestamp": "2019-02-24T00:45:00Z",
  "device_id": "181606aabbf155217f59e302541638bfc7e07837"
}
{
  "timestamp": "2019-02-24T00:46:00Z",
  "device_id": "181606aabbf155217f59e302541638bfc7e07837"
}
{
  "timestamp": "2019-02-24T00:47:00Z",
  "device_id": "181606aabbf155217f59e302541638bfc7e07837"
}
{
  "timestamp": "2019-02-24T00:48:00Z",
  "device_id": "181606aabbf155217f59e302541638bfc7e07837"
}
{
  "timestamp": "2019-02-23T21:56:00Z",
  "device_id": "1b62573cdfdab3902b72ec9e4797c422271f2efd"
}
{
  "timestamp": "2019-02-23T21:57:00Z",
  "device_id": "1b62573cdfdab3902b72ec9e4797c422271f2efd"
}
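The objects listed above pair a timestamp with a device_id, while the question asks for a count of distinct devices at every minute of a fixed range. One possible way to bridge that gap is sketched below in Python. The function name `minute_counts` and its parameters are hypothetical. The sketch keys each object by the first 16 characters of its timestamp (`YYYY-MM-DDTHH:MM`), so any seconds component is ignored, and it emits a count of 0 for minutes in which no device was active.

```python
from datetime import datetime, timedelta


def minute_counts(pairs, first_minute, last_minute):
    """Yield one {"timestamp", "count"} dict per minute in [first_minute, last_minute]."""
    seen = {}  # "YYYY-MM-DDTHH:MM" -> set of device_ids active in that minute
    for obj in pairs:
        key = obj["timestamp"][:16]  # drop the seconds and the trailing Z
        seen.setdefault(key, set()).add(obj["device_id"])

    fmt = "%Y-%m-%dT%H:%M"
    minute = datetime.strptime(first_minute, fmt)
    end = datetime.strptime(last_minute, fmt)
    # Walk the whole requested range so that idle minutes appear with count 0.
    while minute <= end:
        key = minute.strftime(fmt)
        yield {"timestamp": key, "count": len(seen.get(key, ()))}
        minute += timedelta(minutes=1)
```

For the range in the question, the bounds would be "2019-02-23T00:00" and "2019-02-24T23:59". The `pairs` argument can be built by running jq with its `-c` option and applying `json.loads` to each output line.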