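I have a large amount of text that I need to split. This is difficult because, technically, it is all one line. The text consists of unformatted logged messages from network devices, and the only way to tell where one message ends is that every message starts with '.{5}\d{7}', for example <186>1093281. How can I read this string, which is saved in a file named "textLog", and split it on that regex to form new strings or an array for clean output?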
Sample input:
<189>795307: Aug 8 11:41:38 EDT: %ILPOWER-5-POWER_GRANTED: Interface Gi1/0/8: Power granted<189>795308: Aug 8 11:41:39 EDT: %ILPOWER-5-IEEE_DISCONNECT: Interface Gi1/0/8: PD removed<189>795309: Aug 8 11:41:45 EDT: %ILPOWER-5-POWER_GRANTED: Interface Gi1/0/8: Power granted<189>795310: Aug 8 11:41:46 EDT: %ILPOWER-5-IEEE_DISCONNECT: Interface Gi1/0/8: PD removed<189>795311: Aug 8 11:41:52 EDT: %ILPOWER-5-POWER_GRANTED: Interface Gi1/0/8: Power granted<189>795312: Aug 8 11:41:53 EDT: %ILPOWER-5-IEEE_DISCONNECT: Interface Gi1/0/8: PD removed<189>795313: Aug 8 11:41:59 EDT: %ILPOWER-5-IEEE_DISCONNECT: Interface Gi1/0/8: PD removed<189>795314: Aug 8 11:42:05 EDT: %ILPOWER-5-POWER_GRANTED: Interface Gi1/0/8: Power granted
(It is formatted as one long string, not as multiple lines.)
Desired output: an array containing...
arr[0]=<189>795307: Aug 8 11:41:38 EDT: %ILPOWER-5-POWER_GRANTED: Interface Gi1/0/8: Power granted
arr[1]=<189>795308: Aug 8 11:41:39 EDT: %ILPOWER-5-IEEE_DISCONNECT: Interface Gi1/0/8: PD removed
arr[2]=<189>795309: Aug 8 11:41:45 EDT: %ILPOWER-5-POWER_GRANTED: Interface Gi1/0/8: Power granted
...
arr[7]=<189>795314: Aug 8 11:42:05 EDT: %ILPOWER-5-POWER_GRANTED: Interface Gi1/0/8: Power granted
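It does not have to be an array or be stored in a data structure; what I care about most is a way to split on the regex so that I can output or save the substrings.
Answer 0 (score: 1)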
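With GNU sed and Bash 4.0 or newer, this can be done with a sed substitution whose output is read into an array.
The sed command looks for any block of five characters followed by six digits (not seven, as the regex in the question implies) that is preceded by one more character, and inserts a newline after that preceding character. Because a preceding character is required, a match at the very beginning of the line is excluded, which is where we do not want to introduce a newline: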
$ mapfile -t arr < <(sed -E 's/(.)(.{5}[[:digit:]]{6})/\1\n\2/g' infile)
$ printf '%s\n' "${arr[@]}"
<189>795307: Aug 8 11:41:38 EDT: %ILPOWER-5-POWER_GRANTED: Interface Gi1/0/8: Power granted
<189>795308: Aug 8 11:41:39 EDT: %ILPOWER-5-IEEE_DISCONNECT: Interface Gi1/0/8: PD removed
<189>795309: Aug 8 11:41:45 EDT: %ILPOWER-5-POWER_GRANTED: Interface Gi1/0/8: Power granted
<189>795310: Aug 8 11:41:46 EDT: %ILPOWER-5-IEEE_DISCONNECT: Interface Gi1/0/8: PD removed
<189>795311: Aug 8 11:41:52 EDT: %ILPOWER-5-POWER_GRANTED: Interface Gi1/0/8: Power granted
<189>795312: Aug 8 11:41:53 EDT: %ILPOWER-5-IEEE_DISCONNECT: Interface Gi1/0/8: PD removed
<189>795313: Aug 8 11:41:59 EDT: %ILPOWER-5-IEEE_DISCONNECT: Interface Gi1/0/8: PD removed
<189>795314: Aug 8 11:42:05 EDT: %ILPOWER-5-POWER_GRANTED: Interface Gi1/0/8: Power granted
The result is then read into the array arr with mapfile, via process substitution. The printf statement prints one array element per line.
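To get the arr[i]=... layout shown in the question, the array can be walked by index. The following is only a minimal sketch, assuming GNU sed and Bash 4.0 or newer; the variable log is a hypothetical stand-in for the contents of infile and holds just the first three messages of the sample input:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the file contents: the first three sample
# messages, joined into a single line as in the question.
log='<189>795307: Aug 8 11:41:38 EDT: %ILPOWER-5-POWER_GRANTED: Interface Gi1/0/8: Power granted<189>795308: Aug 8 11:41:39 EDT: %ILPOWER-5-IEEE_DISCONNECT: Interface Gi1/0/8: PD removed<189>795309: Aug 8 11:41:45 EDT: %ILPOWER-5-POWER_GRANTED: Interface Gi1/0/8: Power granted'

# Same substitution as in the answer, fed from the variable instead of a file.
# mapfile -t drops the trailing newline of each line it stores.
mapfile -t arr < <(printf '%s\n' "$log" | sed -E 's/(.)(.{5}[[:digit:]]{6})/\1\n\2/g')

# "${!arr[@]}" expands to the indices of the array, so each element can be
# printed together with its position.
for i in "${!arr[@]}"; do
  printf 'arr[%d]=%s\n' "$i" "${arr[$i]}"
done
```

Each message ends up in its own element, so a single message can also be picked out directly, for example with "${arr[1]}".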
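Alternatively, based on the sample input, you can use grep to split it into lines as follows:
$ grep -o '<[^<]*' infile
This assumes that every occurrence of < marks a new log line.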
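If a message body could itself contain a <, that assumption would break. As a rough sketch only, the split can instead be keyed on the whole message header. This assumes GNU sed and that every header has the form <digits>digits: followed by a space, which is inferred from the sample input rather than from any specification. The variable log is again a hypothetical two-message stand-in for infile:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the file contents: two sample messages on one line.
log='<189>795307: Aug 8 11:41:38 EDT: %ILPOWER-5-POWER_GRANTED: Interface Gi1/0/8: Power granted<189>795308: Aug 8 11:41:39 EDT: %ILPOWER-5-IEEE_DISCONNECT: Interface Gi1/0/8: PD removed'

# The grep approach: every '<' starts a new output line.
split_on_lt=$(printf '%s\n' "$log" | grep -o '<[^<]*')

# Header-based variant: insert a newline before each "<digits>digits: "
# header that is preceded by some character, so the header at the very
# start of the line is left alone.
split_on_header=$(printf '%s\n' "$log" | sed -E 's/(.)(<[0-9]+>[0-9]+: )/\1\n\2/g')

printf '%s\n' "$split_on_header"
```

For the sample data both commands produce the same lines; they would differ only for messages that contain a < of their own.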