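UNIX: Extract only the information I need

Time: 2017-02-09 16:59:25

Tags: unix awk grep ls

I have the following content in a file, and I need to extract some of it into another file to make analysis easier.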

saimptlogi_1~20170208022514~procRTLFHead~~103~RET-0103: generic function processing error~DATAUNEXPECTEDSTOREDAY on FHEAD record at line 0000000001 in /oretail/apprms/mmhome/data/in/RTLOG_4403_20170115010230_1.dat
saimptlogi_1~20170208022549~procRTLFHead~~103~RET-0103: generic function processing error~DATAUNEXPECTEDSTOREDAY on FHEAD record at line 0000000001 in /oretail/apprms/mmhome/data/in/RTLOG_4189_20170122010240_1.dat
saimptlogi_1~20170208022555~procRTLFHead~~103~RET-0103: generic function processing error~DATAUNEXPECTEDSTOREDAY on FHEAD record at line 0000000001 in /oretail/apprms/mmhome/data/in/RTLOG_4403_20170116010200_1.dat
saimptlogi_1~20170208022556~procRTLFHead~~103~RET-0103: generic function processing error~DATAUNEXPECTEDSTOREDAY on FHEAD record at line 0000000001 in /oretail/apprms/mmhome/data/in/RTLOG_4189_20170108010210_1.dat
saimptlogi_1~20170208022610~procRTLFHead~~103~RET-0103: generic function processing error~DATAUNEXPECTEDSTOREDAY on FHEAD record at line 0000000001 in /oretail/apprms/mmhome/data/in/RTLOG_4147_20170101010223_1.dat
saimptlogi_1~20170208022643~procRTLFHead~~103~RET-0103: generic function processing error~DATAUNEXPECTEDSTOREDAY on FHEAD record at line 0000000001 in /oretail/apprms/mmhome/data/in/RTLOG_4189_20170107010206_1.dat
saimptlogi_1~20170208022703~procRTLFHead~~103~RET-0103: generic function processing error~STOREDAYNOTREADYTOBELOAD on FHEAD record at line 0000000001 in /oretail/apprms/mmhome/data/in/RTLOG_4549_20170126010247_7.dat
saimptlogi_1~20170208022707~procRTLFHead~~103~RET-0103: generic function processing error~DATAUNEXPECTEDSTOREDAY on FHEAD record at line 0000000001 in /oretail/apprms/mmhome/data/in/RTLOG_4189_20170114010259_1.dat
saimptlogi_1~20170208022736~procRTLFHead~~103~RET-0103: generic function processing error~DATAUNEXPECTEDSTOREDAY on FHEAD record at line 0000000001 in /oretail/apprms/mmhome/data/in/RTLOG_4403_20170108010211_1.dat

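I want to extract the store (the 4403 in RTLOG_4403_20170108010211_1) and the date (the 20170108 in RTLOG_4403_20170108010211_1) of each error into another file. I need output like the following:

Example: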

  • DATAUNEXPECTEDSTOREDAY 4403 20170108
  • STOREDAYNOTREADYTOBELOAD 4549 20170126

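I have already developed a command that extracts the STORE and DATE directly from the files (the RTLOGs), but it would be better to extract them directly from this log file.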

My command: ls {RTLOG*.failed,RTLOG*.rej} | awk -F'|' '{gsub("_"," "); print substr($0,7,13),$4}'

Thanks in advance.

2 answers:

Answer 0: (score: 0)

Even though I really like AWK, in this case I would use a sed command to produce the desired result:

sed -r 's/^.*error.([A-Z]*).*RTLOG_([0-9]*)_([0-9]{8}).*/\1|\2|\3/'

which produces something like this:

DATAUNEXPECTEDSTOREDAY|4403|20170115
DATAUNEXPECTEDSTOREDAY|4189|20170122
DATAUNEXPECTEDSTOREDAY|4403|20170116
DATAUNEXPECTEDSTOREDAY|4189|20170108
DATAUNEXPECTEDSTOREDAY|4147|20170101
DATAUNEXPECTEDSTOREDAY|4189|20170107
STOREDAYNOTREADYTOBELOAD|4549|20170126
DATAUNEXPECTEDSTOREDAY|4189|20170114
DATAUNEXPECTEDSTOREDAY|4403|20170108
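
If the space-separated layout from the question is wanted in a separate file, and lines that do not match should be skipped rather than passed through, the same expression can be adapted. The following is only a minimal sketch, assuming a sed that supports -E (recent GNU and BSD versions do); the file names sample.log and errors.txt are hypothetical, and the sample holds two log lines from the question plus one unrelated line.

```shell
# Hypothetical sample file: two log lines from the question and one unrelated line.
cat > sample.log <<'EOF'
saimptlogi_1~20170208022703~procRTLFHead~~103~RET-0103: generic function processing error~STOREDAYNOTREADYTOBELOAD on FHEAD record at line 0000000001 in /oretail/apprms/mmhome/data/in/RTLOG_4549_20170126010247_7.dat
some unrelated line
saimptlogi_1~20170208022736~procRTLFHead~~103~RET-0103: generic function processing error~DATAUNEXPECTEDSTOREDAY on FHEAD record at line 0000000001 in /oretail/apprms/mmhome/data/in/RTLOG_4403_20170108010211_1.dat
EOF

# -n suppresses the default output; the trailing p prints only the lines
# on which the substitution succeeded, so unrelated lines are dropped.
sed -nE 's/^.*error.([A-Z]+).*RTLOG_([0-9]+)_([0-9]{8}).*/\1 \2 \3/p' sample.log > errors.txt

cat errors.txt
```

With this sample, errors.txt should contain one line per matching log entry, in the form error name, store, date.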

Answer 1: (score: 0)

@Pedro: Try:

awk '{match($0,/DATAUNEXPECTEDSTOREDAY|STOREDAYNOTREADYTOBELOAD/);if(substr($0,RSTART,RLENGTH)){A=substr($0,RSTART,RLENGTH)};match($0,/RTLOG_.*\.dat/);if(substr($0,RSTART,RLENGTH)){split(substr($0,RSTART,RLENGTH), Q,"_");print A OFS Q[2] OFS substr(Q[3],1,8)}}'  OFS="|"   Input_file

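Here I use awk's match function. The first match looks for the string "DATAUNEXPECTEDSTOREDAY|STOREDAYNOTREADYTOBELOAD" and then checks whether the substring given by RSTART and RLENGTH is non-empty (RSTART and RLENGTH are variables that awk sets when match finds the regex in a line); if it is, variable A is set to substr($0,RSTART,RLENGTH). The second match looks for RTLOG_.*\.dat to get the "RTLOG_4147_20170101010223_1.dat" part of the line; if that match is found, split breaks substr($0,RSTART,RLENGTH) into an array named Q, using "_" as the delimiter. Finally it prints A OFS Q[2] OFS substr(Q[3],1,8), where Q[2] is the second element of Q (4403, 4189, and so on) and substr(Q[3],1,8) keeps only the first 8 characters of the third element, the 20170108 in RTLOG_4403_20170108010211_1, as the OP requested.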
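
To make the RSTART/RLENGTH and split steps concrete, here is a small sketch that runs them on a single log line from the question and prints the intermediate values. The shell variables line and out, and the awk variables name and n, are hypothetical names used only for this illustration.

```shell
# One log line from the question, used as sample input.
line='saimptlogi_1~20170208022610~procRTLFHead~~103~RET-0103: generic function processing error~DATAUNEXPECTEDSTOREDAY on FHEAD record at line 0000000001 in /oretail/apprms/mmhome/data/in/RTLOG_4147_20170101010223_1.dat'

out=$(printf '%s\n' "$line" | awk '{
  # match() returns the start position (0 when nothing matches)
  # and sets RSTART and RLENGTH as a side effect.
  if (match($0, /RTLOG_.*\.dat/)) {
    name = substr($0, RSTART, RLENGTH)  # the file name part of the line
    n = split(name, Q, "_")             # Q[1]="RTLOG", Q[2]=store, Q[3]=timestamp
    print "matched:", name
    print "pieces:", n
    print "store:", Q[2]
    print "date:", substr(Q[3], 1, 8)   # first 8 characters of the timestamp
  }
}')

printf '%s\n' "$out"
```

Testing the return value of match directly, as done here, is a slightly different style from the answer's check on the substring, but both rely on the same RSTART and RLENGTH values.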

Adding a non-one-liner form of the solution as well:

awk '{
  match($0, /DATAUNEXPECTEDSTOREDAY|STOREDAYNOTREADYTOBELOAD/)
  if (substr($0, RSTART, RLENGTH)) {
    A = substr($0, RSTART, RLENGTH)
  }
  match($0, /RTLOG_.*\.dat/)
  if (substr($0, RSTART, RLENGTH)) {
    split(substr($0, RSTART, RLENGTH), Q, "_")
    print A OFS Q[2] OFS substr(Q[3], 1, 8)
  }
}' OFS="|" Input_file