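This is a sample of my text file. I need to count the lines containing the word "Id", grouped by second, using the timestamp that appears before the pipe ("|").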
2019-02-10 12:00:03.448|Id: 26102338
2019-02-10 12:00:03.448|Id: 25941418
2019-02-10 12:00:03.449|Id: 25827373
2019-02-10 12:00:03.449|Id: 26102038
2019-02-10 12:00:03.449|Id: 25929358
2019-02-10 12:00:04.382 | =====================================Start fetching=====================================
2019-02-10 12:00:04.451 |
2019-02-10 12:00:04.426|Id: 25713118
2019-02-10 12:00:04.426|Id: 26076208
2019-02-10 12:00:04.426|Id: 26079643
2019-02-10 12:00:04.426|Id: 26085973
2019-02-10 12:00:04.426|Id: 26090023
2019-02-10 12:00:04.426|Id: 26130133
2019-02-10 12:00:04.426|Id: 25954018
2019-02-10 12:00:04.427|Id: 25951468
2019-02-10 12:00:04.427|Id: 26136148
2019-02-10 12:00:04.427|Id: 26103013
2019-02-10 12:00:04.427|Id: 25806433
I need output like this:
Time |Count(Id)
2019-02-10 12:00:03|5
2019-02-10 12:00:04|11
Can anyone help?
Answer 0 (score: 1)
If every line of interest always ends with an Id entry, and you don't mind the output columns being in the reverse order, this is simple:
grep 'Id:' /tmp/data.txt | cut -f 1 -d '.' | uniq -c
5 2019-02-10 12:00:03
11 2019-02-10 12:00:04
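grep drops the blank and separator lines (anything without an Id: entry).
cut selects the field before the dot (that is, the time without the milliseconds).
uniq -c counts the occurrences of each distinct timestamp.
(If the file is not always in chronological order, you may also need a sort before uniq.)
To swap the columns and add the pipe so the output matches the requested format, the result can be piped through sed.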
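The exact sed command for that last step is not preserved in this answer, so the following is only a sketch of how it might look. It assumes a sed that supports -E (GNU and BSD sed both do), and the sample variable is a made-up excerpt in the same shape as the log in the question.

```shell
# Hypothetical sample input, shaped like the log lines in the question.
sample='2019-02-10 12:00:03.448|Id: 26102338
2019-02-10 12:00:03.449|Id: 25827373
2019-02-10 12:00:04.451 |
2019-02-10 12:00:04.426|Id: 25713118'

# Count Id lines per second, then let sed move the count behind the timestamp.
# In the sed pattern, \1 is the count printed by uniq -c and \2 is the
# "date time" text that follows it.
result=$(printf '%s\n' "$sample" | grep 'Id:' | cut -f 1 -d '.' | sort | uniq -c |
  sed -E 's/^ *([0-9]+) +(.*)$/\2|\1/')

printf '%s\n' "$result"
```

With this sample the pipeline should print one line per second in the form timestamp|count. The sort is included so the result does not depend on the input order.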
Answer 1 (score: -1)
data.txt
2019-02-10 12:00:03.448|Id: 26102338
2019-02-10 12:00:03.448|Id: 25941418
2019-02-10 12:00:03.449|Id: 25827373
2019-02-10 12:00:03.449|Id: 26102038
2019-02-10 12:00:03.449|Id: 25929358
2019-02-10 12:00:04.426|Id: 25713118
2019-02-10 12:00:04.426|Id: 26076208
2019-02-10 12:00:04.426|Id: 26079643
2019-02-10 12:00:04.426|Id: 26085973
2019-02-10 12:00:04.426|Id: 26090023
2019-02-10 12:00:04.426|Id: 26130133
2019-02-10 12:00:04.426|Id: 25954018
2019-02-10 12:00:04.427|Id: 25951468
2019-02-10 12:00:04.427|Id: 26136148
2019-02-10 12:00:04.427|Id: 26103013
2019-02-10 12:00:04.427|Id: 25806433
2019-02-10 12:00:03.427|Id: 25806433
Command:
grep 'Id:' data.txt | cut -f 1 -d '.' | sort | uniq -c | awk '{print $2" "$3" | "$1}'
Sorting before counting keeps out-of-order timestamps (such as the last line of data.txt) in the same group.
Output:
2019-02-10 12:00:03 | 6
2019-02-10 12:00:04 | 11
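As an alternative to chaining grep, cut, sort, uniq and awk, the same grouping can probably be done in a single awk pass. This is a sketch rather than part of either answer: it assumes every timestamp starts with a 19-character "YYYY-MM-DD HH:MM:SS" prefix and that the text right after the pipe starts with Id:. The sample variable is a made-up excerpt.

```shell
# Hypothetical sample input, including a separator line and an out-of-order line.
sample='2019-02-10 12:00:03.448|Id: 26102338
2019-02-10 12:00:04.426|Id: 25713118
2019-02-10 12:00:04.382 | ====Start fetching====
2019-02-10 12:00:04.427|Id: 25806433
2019-02-10 12:00:03.427|Id: 25806433'

result=$(
  echo 'Time |Count(Id)'          # header in the format the question asks for
  printf '%s\n' "$sample" | awk -F'|' '
    $2 ~ /^Id:/ {                 # keep only lines whose second field starts with Id:
      sec = substr($1, 1, 19)     # timestamp truncated to whole seconds
      count[sec]++
    }
    END {
      for (sec in count) print sec "|" count[sec]
    }' | sort                     # awk array order is unspecified, so sort the rows
)

printf '%s\n' "$result"
```

Because the counts are kept in an awk array, the input does not need to be sorted first. Only the final rows are sorted, and the header is printed separately so it stays on top.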