Python code to extract data from a .out file

Time: 2019-03-29 03:35:31

Tags: python

Can you tell me how to extract data from a file and store it in a txt file?

Here is my file:

######################################################################
000003c80e06 - 000-00634-42438-177 - Thu Mar 28 13:17:42 GMT 2019
######################################################################

       Packet Time       |     Source     |  Destination   | Length |    Spid    |    Description     |                               Payload 
-------------------------+----------------+----------------+--------+------------+--------------------+---------------------------------------------------------------------
 2019-03-24 13:36:34.445 | 10.11.0.24     | Charter Stable |     27 | FLUX rDVR  |                    | .X...B.......k..........-..
 2019-03-24 13:36:34.477 | Charter Stable | 10.11.0.24     |      3 | FLUX rDVR  |                    | .X.
 2019-03-24 13:38:55.956 | 10.11.0.24     | Charter Stable |     79 | 2Way Proxy | App.list DL        | POST.http://qsas:7070/mac-settings/app.list?..p:id=..70...uid=00000
 2019-03-24 13:38:58.678 | Charter Stable | 10.11.0.24     |    698 | 2Way Proxy | 200 Response       | 200.dd:1553434738.cl:659.p:id=..70...EBIF.......H..................
 2019-03-24 15:22:47.237 | 10.11.0.24     | Charter Stable |     16 | ESS        | Settings.cr        | .....GR7..&p=1C
 2019-03-24 15:22:47.268 | Charter Stable | 10.11.0.24     |     72 | ESS        | 304 Response       | 304.dd:1553440967.ex:1553527367.lm:1548892141.du:86400.au:1.p:id=71
 2019-03-24 15:23:08.84  | 10.11.0.24     | Charter Stable |     62 | Bin Req    | MacG Reg           | ....^HL5|mac000003c80e06|upt1552481982|ipa168493080|pid4|nid14
 2019-03-24 15:23:08.868 | Charter Stable | 10.11.0.24     |     16 | Bin Req    | 202 Response       | 202.p:id=72.....
 2019-03-24 15:25:24.31  | 10.11.0.24     | Charter Stable |     95 | 2Way Proxy | IU DR Download     | GET.http://webdav/dav/IU/iu/1.7/M6412-G345/iu_channelList_dr.dr..p:
 2019-03-24 15:25:24.4   | Charter Stable | 10.11.0.24     |     46 | 2Way Proxy | 304 Response       | 304.dd:1553441124.ex:1525371531.p:id=..73.....
 2019-03-24 19:36:34.749 | 10.11.0.24     | Charter Stable |     27 | FLUX rDVR  |                    | .Y...B.......k.............
 2019-03-24 19:36:34.781 | Charter Stable | 10.11.0.24     |      3 | FLUX rDVR  |                    | .Y.
 2019-03-25 01:36:35.165 | 10.11.0.24     | Charter Stable |     27 | FLUX rDVR  |                    | .Z...B.......k........../..
 2019-03-25 01:36:35.201 | Charter Stable | 10.11.0.24     |      3 | FLUX rDVR  |                    | .Z.
 2019-03-25 07:36:35.365 | 10.11.0.24     | Charter Stable |     27 | FLUX rDVR  |                    | .[...B.......k..........0..
 2019-03-25 07:36:35.399 | Charter Stable | 10.11.0.24     |      3 | FLUX rDVR  |                    | .[.
 2019-03-25 07:50:36.069 | 10.11.0.24     | Charter Stable |     64 | 2Way Proxy | ESC Post           | POST.http://esc/esc/..p:id=..74...p=AAADyA4GXJcPDQADpYgQDIA&v=95

######################################################################
0000136cf429 - 000-03259-07497-171 - Thu Mar 28 13:17:43 GMT 2019
######################################################################

       Packet Time       |     Source     |  Destination   | Length |    Spid    |    Description     |                               Payload 
-------------------------+----------------+----------------+--------+------------+--------------------+---------------------------------------------------------------------
 2019-03-24 11:58:30.799 | 10.11.0.25     | Charter Stable |     78 | 2Way Proxy | App.list DL        | POST.http://qsas:7070/mac-settings/app.list..p:id=..72...uid=000013
 2019-03-24 11:58:30.88  | Charter Stable | 10.11.0.25     |    698 | 2Way Proxy | 200 Response       | 200.dd:1553428710.cl:659.p:id=..72...EBIF.......H..................
 2019-03-24 13:44:10.828 | 10.11.0.25     | Charter Stable |     27 | FLUX rDVR  |                    | ....B..................-..
 2019-03-24 13:44:10.861 | Charter Stable | 10.11.0.25     |      3 | FLUX rDVR  |                    | ..
 2019-03-24 16:06:20.963 | 10.11.0.25     | Charter Stable |     16 | ESS        | Settings.cr        | .....IR?..&p=1B
 2019-03-24 16:06:21     | Charter Stable | 10.11.0.25     |     72 | ESS        | 304 Response       | 304.dd:1553443580.ex:1553529980.lm:1548894145.du:86400.au:1.p:id=73
 2019-03-24 16:06:40.07  | 10.11.0.25     | Charter Stable |     62 | Bin Req    | MacG Reg           | ....^JL5|mac0000136cf429|upt1552482043|ipa168493081|pid4|nid14
 2019-03-24 16:06:40.1   | Charter Stable | 10.11.0.25     |     16 | Bin Req    | 202 Response       | 202.p:id=74.....
 2019-03-24 16:10:30.97  | 10.11.0.25     | Charter Stable |     95 | 2Way Proxy | IU DR Download     | GET.http://webdav/dav/IU/iu/1.7/M2500-G345/iu_channelList_dr.dr..p:
 2019-03-24 16:10:31.042 | Charter Stable | 10.11.0.25     |     46 | 2Way Proxy | 304 Response       | 304.dd:1553443831.ex:1525371619.p:id=..75.....


######################################################################
000004b23046 - 000-00787-86630-065 - Thu Mar 28 13:17:43 GMT 2019
######################################################################

       Packet Time       |     Source     |  Destination   | Length |    Spid    |    Description     |                               Payload 
-------------------------+----------------+----------------+--------+------------+--------------------+---------------------------------------------------------------------
 2019-03-24 13:10:17.474 | 10.11.0.33     | Charter Stable |     78 | 2Way Proxy | App.list DL        | POST.http://qsas:7070/mac-settings/app.list..p:id=..80...uid=000004
 2019-03-24 13:10:17.574 | Charter Stable | 10.11.0.33     |    698 | 2Way Proxy | 200 Response       | 200.dd:1553433017.cl:659.p:id=..80...EBIF.......H..................
 2019-03-24 13:26:28.326 | 10.11.0.33     | Charter Stable |     27 | FLUX rDVR  |                    | .,...B.......k....".....+..
 2019-03-24 13:26:28.362 | Charter Stable | 10.11.0.33     |      3 | FLUX rDVR  |                    | .,.
 2019-03-24 17:01:16.116 | 10.11.0.33     | Charter Stable |     62 | Bin Req    | MacG Reg           | ....^QL5|mac000004b23046|upt1552481982|ipa168493089|pid4|nid14
 2019-03-24 17:01:16.146 | Charter Stable | 10.11.0.33     |     16 | Bin Req    | 202 Response       | 202.p:id=81.....
 2019-03-24 17:01:23.446 | 10.11.0.33     | Charter Stable |     16 | ESS        | Settings.cr        | .....R....&p=1C
 2019-03-24 17:01:23.486 | Charter Stable | 10.11.0.33     |     72 | ESS        | 304 Response       | 304.dd:1553446883.ex:1553533283.lm:1552409302.du:86400.au:1.p:id=82


######################################################################
00000337e789 - 000-00539-95401-172 - Thu Mar 28 13:17:43 GMT 2019
######################################################################

       Packet Time       |     Source     |  Destination   | Length |    Spid    |    Description     |                               Payload 
-------------------------+----------------+----------------+--------+------------+--------------------+---------------------------------------------------------------------
 2019-03-24 12:34:49.281 | 10.11.0.12     | Charter Stable |     78 | 2Way Proxy | App.list DL        | POST.http://qsas:7070/mac-settings/app.list..p:id=.249...uid=000003
 2019-03-24 12:34:49.398 | Charter Stable | 10.11.0.12     |    698 | 2Way Proxy | 200 Response       | 200.dd:1553430889.cl:659.p:id=.249...EBIF.......H..................
 2019-03-24 13:28:50.076 | 10.11.0.12     | Charter Stable |     27 | FLUX rDVR  |                    | .....B.......k..........-..
 2019-03-24 13:28:50.113 | Charter Stable | 10.11.0.12     |      3 | FLUX rDVR  |                    | ...
 2019-03-24 15:41:37.731 | 10.11.0.12     | Charter Stable |     16 | ESS        | Settings.cr        | ..........&p=1C
 2019-03-24 15:41:37.767 | Charter Stable | 10.11.0.12     |     73 | ESS        | 304 Response       | 304.dd:1553442097.ex:1553528497.lm:1552325617.du:86400.au:1.p:id=25

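I want to extract the top line of each block, like this:

  • macaddress: 000003c80e06 unit address: 000-00634-42438-177 execution time: Thu Mar 28 13:17:42 2019

  • macaddress: 0000136cf429 unit address: 000-03259-07497-171 execution time: Thu Mar 28 13:17:43 2019

and so on, for every block in the file.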

1 answer:

Answer 0: (score: 0)

You can try it like this.

I have stored the required output in 2 formats: as a list ['000003c80e06', '000-00634-42438-177', 'Thu Mar 28 13:17:42 GMT 2019'] and as a dictionary {'execution_time': 'Thu Mar 28 13:17:42 GMT 2019', 'unit_address': '000-00634-42438-177', 'mac_address': '000003c80e06'}

  

»data.txt

######################################################################
000003c80e06 - 000-00634-42438-177 - Thu Mar 28 13:17:42 GMT 2019
######################################################################

       Packet Time       |     Source     |  Destination   | Length |    Spid    |    Description     |                               Payload 
-------------------------+----------------+----------------+--------+------------+--------------------+---------------------------------------------------------------------
 2019-03-24 13:36:34.445 | 10.11.0.24     | Charter Stable |     27 | FLUX rDVR  |                    | .X...B.......k..........-..
 2019-03-24 13:36:34.477 | Charter Stable | 10.11.0.24     |      3 | FLUX rDVR  |                    | .X.
 2019-03-24 13:38:55.956 | 10.11.0.24     | Charter Stable |     79 | 2Way Proxy | App.list DL        | POST.http://qsas:7070/mac-settings/app.list?..p:id=..70...uid=00000
 2019-03-24 13:38:58.678 | Charter Stable | 10.11.0.24     |    698 | 2Way Proxy | 200 Response       | 200.dd:1553434738.cl:659.p:id=..70...EBIF.......H..................
 2019-03-24 15:22:47.237 | 10.11.0.24     | Charter Stable |     16 | ESS        | Settings.cr        | .....GR7..&p=1C
 2019-03-24 15:22:47.268 | Charter Stable | 10.11.0.24     |     72 | ESS        | 304 Response       | 304.dd:1553440967.ex:1553527367.lm:1548892141.du:86400.au:1.p:id=71
 2019-03-24 15:23:08.84  | 10.11.0.24     | Charter Stable |     62 | Bin Req    | MacG Reg           | ....^HL5|mac000003c80e06|upt1552481982|ipa168493080|pid4|nid14
 2019-03-24 15:23:08.868 | Charter Stable | 10.11.0.24     |     16 | Bin Req    | 202 Response       | 202.p:id=72.....
 2019-03-24 15:25:24.31  | 10.11.0.24     | Charter Stable |     95 | 2Way Proxy | IU DR Download     | GET.http://webdav/dav/IU/iu/1.7/M6412-G345/iu_channelList_dr.dr..p:
 2019-03-24 15:25:24.4   | Charter Stable | 10.11.0.24     |     46 | 2Way Proxy | 304 Response       | 304.dd:1553441124.ex:1525371531.p:id=..73.....
 2019-03-24 19:36:34.749 | 10.11.0.24     | Charter Stable |     27 | FLUX rDVR  |                    | .Y...B.......k.............
 2019-03-24 19:36:34.781 | Charter Stable | 10.11.0.24     |      3 | FLUX rDVR  |                    | .Y.
 2019-03-25 01:36:35.165 | 10.11.0.24     | Charter Stable |     27 | FLUX rDVR  |                    | .Z...B.......k........../..
 2019-03-25 01:36:35.201 | Charter Stable | 10.11.0.24     |      3 | FLUX rDVR  |                    | .Z.
 2019-03-25 07:36:35.365 | 10.11.0.24     | Charter Stable |     27 | FLUX rDVR  |                    | .[...B.......k..........0..
 2019-03-25 07:36:35.399 | Charter Stable | 10.11.0.24     |      3 | FLUX rDVR  |                    | .[.
 2019-03-25 07:50:36.069 | 10.11.0.24     | Charter Stable |     64 | 2Way Proxy | ESC Post           | POST.http://esc/esc/..p:id=..74...p=AAADyA4GXJcPDQADpYgQDIA&v=95
  

Read the comments in the code to see what each statement does.

     

»extract_new_line.py

import json

with open("data.txt") as f:
    lines = f.readlines()

# The header is the 2nd line of the file, so it is lines[1] (0-based index)
line = lines[1].strip()  # strip() removes whitespace such as '\n' from both ends

print(line)
# 000003c80e06 - 000-00634-42438-177 - Thu Mar 28 13:17:42 GMT 2019

arr = line.split(" - ")
print(arr)
# ['000003c80e06', '000-00634-42438-177', 'Thu Mar 28 13:17:42 GMT 2019']

# --*-- If you want to store it in a dictionary, you can do it like this --*--
d = {
    "mac_address": arr[0],
    "unit_address": arr[1],
    "execution_time": arr[2]
}
print(d)
# Key order shown is for Python 3.7+, which preserves insertion order
# {'mac_address': '000003c80e06', 'unit_address': '000-00634-42438-177', 'execution_time': 'Thu Mar 28 13:17:42 GMT 2019'}

# --*-- If you want to pretty print the above dict --*--
pretty_d = json.dumps(d, indent=4)
print(pretty_d)
# {
#     "mac_address": "000003c80e06",
#     "unit_address": "000-00634-42438-177",
#     "execution_time": "Thu Mar 28 13:17:42 GMT 2019"
# }
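
The question also asks for the result to be stored in a txt file, which the script above does not do. A minimal sketch of that step follows, assuming the labels from the question ("macaddress", "unit address", "execution time") are the wanted output format. The helper names `format_record` and `save_records` are made up for illustration and are not part of the original answer.

```python
def format_record(record):
    """Render one parsed header dict as a single labeled line of text."""
    # The label wording is an assumption taken from the question's example.
    return (
        "macaddress: {mac_address} "
        "unit address: {unit_address} "
        "execution time: {execution_time}"
    ).format(**record)


def save_records(records, out_path):
    """Write one labeled line per record to a plain-text file."""
    with open(out_path, "w") as out:
        for record in records:
            out.write(format_record(record) + "\n")
```

With the dictionary `d` from the script above, calling `save_records([d], "output.txt")` should write one line to `output.txt` (the file name is only an example).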


  

Update: the new data text contains more than one header line.

Per your specification, here is the content of data_new.txt.

  

»data_new.txt

######################################################################
000003c80e06 - 000-00634-42438-177 - Thu Mar 28 13:17:42 GMT 2019
######################################################################

       Packet Time       |     Source     |  Destination   | Length |    Spid    |    Description     |                               Payload 
-------------------------+----------------+----------------+--------+------------+--------------------+---------------------------------------------------------------------
 2019-03-24 13:36:34.445 | 10.11.0.24     | Charter Stable |     27 | FLUX rDVR  |                    | .X...B.......k..........-..
 2019-03-24 13:36:34.477 | Charter Stable | 10.11.0.24     |      3 | FLUX rDVR  |                    | .X.
 2019-03-24 13:38:55.956 | 10.11.0.24     | Charter Stable |     79 | 2Way Proxy | App.list DL        | POST.http://qsas:7070/mac-settings/app.list?..p:id=..70...uid=00000
 2019-03-24 13:38:58.678 | Charter Stable | 10.11.0.24     |    698 | 2Way Proxy | 200 Response       | 200.dd:1553434738.cl:659.p:id=..70...EBIF.......H..................
 2019-03-24 15:22:47.237 | 10.11.0.24     | Charter Stable |     16 | ESS        | Settings.cr        | .....GR7..&p=1C
 2019-03-24 15:22:47.268 | Charter Stable | 10.11.0.24     |     72 | ESS        | 304 Response       | 304.dd:1553440967.ex:1553527367.lm:1548892141.du:86400.au:1.p:id=71
 2019-03-24 15:23:08.84  | 10.11.0.24     | Charter Stable |     62 | Bin Req    | MacG Reg           | ....^HL5|mac000003c80e06|upt1552481982|ipa168493080|pid4|nid14
 2019-03-24 15:23:08.868 | Charter Stable | 10.11.0.24     |     16 | Bin Req    | 202 Response       | 202.p:id=72.....
 2019-03-24 15:25:24.31  | 10.11.0.24     | Charter Stable |     95 | 2Way Proxy | IU DR Download     | GET.http://webdav/dav/IU/iu/1.7/M6412-G345/iu_channelList_dr.dr..p:
 2019-03-24 15:25:24.4   | Charter Stable | 10.11.0.24     |     46 | 2Way Proxy | 304 Response       | 304.dd:1553441124.ex:1525371531.p:id=..73.....
 2019-03-24 19:36:34.749 | 10.11.0.24     | Charter Stable |     27 | FLUX rDVR  |                    | .Y...B.......k.............
 2019-03-24 19:36:34.781 | Charter Stable | 10.11.0.24     |      3 | FLUX rDVR  |                    | .Y.
 2019-03-25 01:36:35.165 | 10.11.0.24     | Charter Stable |     27 | FLUX rDVR  |                    | .Z...B.......k........../..
 2019-03-25 01:36:35.201 | Charter Stable | 10.11.0.24     |      3 | FLUX rDVR  |                    | .Z.
 2019-03-25 07:36:35.365 | 10.11.0.24     | Charter Stable |     27 | FLUX rDVR  |                    | .[...B.......k..........0..
 2019-03-25 07:36:35.399 | Charter Stable | 10.11.0.24     |      3 | FLUX rDVR  |                    | .[.
 2019-03-25 07:50:36.069 | 10.11.0.24     | Charter Stable |     64 | 2Way Proxy | ESC Post           | POST.http://esc/esc/..p:id=..74...p=AAADyA4GXJcPDQADpYgQDIA&v=95

######################################################################
0000136cf429 - 000-03259-07497-171 - Thu Mar 28 13:17:43 GMT 2019
######################################################################

       Packet Time       |     Source     |  Destination   | Length |    Spid    |    Description     |                               Payload 
-------------------------+----------------+----------------+--------+------------+--------------------+---------------------------------------------------------------------
 2019-03-24 11:58:30.799 | 10.11.0.25     | Charter Stable |     78 | 2Way Proxy | App.list DL        | POST.http://qsas:7070/mac-settings/app.list..p:id=..72...uid=000013
 2019-03-24 11:58:30.88  | Charter Stable | 10.11.0.25     |    698 | 2Way Proxy | 200 Response       | 200.dd:1553428710.cl:659.p:id=..72...EBIF.......H..................
 2019-03-24 13:44:10.828 | 10.11.0.25     | Charter Stable |     27 | FLUX rDVR  |                    | ....B..................-..
 2019-03-24 13:44:10.861 | Charter Stable | 10.11.0.25     |      3 | FLUX rDVR  |                    | ..
 2019-03-24 16:06:20.963 | 10.11.0.25     | Charter Stable |     16 | ESS        | Settings.cr        | .....IR?..&p=1B
 2019-03-24 16:06:21     | Charter Stable | 10.11.0.25     |     72 | ESS        | 304 Response       | 304.dd:1553443580.ex:1553529980.lm:1548894145.du:86400.au:1.p:id=73
 2019-03-24 16:06:40.07  | 10.11.0.25     | Charter Stable |     62 | Bin Req    | MacG Reg           | ....^JL5|mac0000136cf429|upt1552482043|ipa168493081|pid4|nid14
 2019-03-24 16:06:40.1   | Charter Stable | 10.11.0.25     |     16 | Bin Req    | 202 Response       | 202.p:id=74.....
 2019-03-24 16:10:30.97  | 10.11.0.25     | Charter Stable |     95 | 2Way Proxy | IU DR Download     | GET.http://webdav/dav/IU/iu/1.7/M2500-G345/iu_channelList_dr.dr..p:
 2019-03-24 16:10:31.042 | Charter Stable | 10.11.0.25     |     46 | 2Way Proxy | 304 Response       | 304.dd:1553443831.ex:1525371619.p:id=..75.....


######################################################################
000004b23046 - 000-00787-86630-065 - Thu Mar 28 13:17:43 GMT 2019
######################################################################

       Packet Time       |     Source     |  Destination   | Length |    Spid    |    Description     |                               Payload 
-------------------------+----------------+----------------+--------+------------+--------------------+---------------------------------------------------------------------
 2019-03-24 13:10:17.474 | 10.11.0.33     | Charter Stable |     78 | 2Way Proxy | App.list DL        | POST.http://qsas:7070/mac-settings/app.list..p:id=..80...uid=000004
 2019-03-24 13:10:17.574 | Charter Stable | 10.11.0.33     |    698 | 2Way Proxy | 200 Response       | 200.dd:1553433017.cl:659.p:id=..80...EBIF.......H..................
 2019-03-24 13:26:28.326 | 10.11.0.33     | Charter Stable |     27 | FLUX rDVR  |                    | .,...B.......k....".....+..
 2019-03-24 13:26:28.362 | Charter Stable | 10.11.0.33     |      3 | FLUX rDVR  |                    | .,.
 2019-03-24 17:01:16.116 | 10.11.0.33     | Charter Stable |     62 | Bin Req    | MacG Reg           | ....^QL5|mac000004b23046|upt1552481982|ipa168493089|pid4|nid14
 2019-03-24 17:01:16.146 | Charter Stable | 10.11.0.33     |     16 | Bin Req    | 202 Response       | 202.p:id=81.....
 2019-03-24 17:01:23.446 | 10.11.0.33     | Charter Stable |     16 | ESS        | Settings.cr        | .....R....&p=1C
 2019-03-24 17:01:23.486 | Charter Stable | 10.11.0.33     |     72 | ESS        | 304 Response       | 304.dd:1553446883.ex:1553533283.lm:1552409302.du:86400.au:1.p:id=82


######################################################################
00000337e789 - 000-00539-95401-172 - Thu Mar 28 13:17:43 GMT 2019
######################################################################

       Packet Time       |     Source     |  Destination   | Length |    Spid    |    Description     |                               Payload 
-------------------------+----------------+----------------+--------+------------+--------------------+---------------------------------------------------------------------
 2019-03-24 12:34:49.281 | 10.11.0.12     | Charter Stable |     78 | 2Way Proxy | App.list DL        | POST.http://qsas:7070/mac-settings/app.list..p:id=.249...uid=000003
 2019-03-24 12:34:49.398 | Charter Stable | 10.11.0.12     |    698 | 2Way Proxy | 200 Response       | 200.dd:1553430889.cl:659.p:id=.249...EBIF.......H..................
 2019-03-24 13:28:50.076 | 10.11.0.12     | Charter Stable |     27 | FLUX rDVR  |                    | .....B.......k..........-..
 2019-03-24 13:28:50.113 | Charter Stable | 10.11.0.12     |      3 | FLUX rDVR  |                    | ...
 2019-03-24 15:41:37.731 | 10.11.0.12     | Charter Stable |     16 | ESS        | Settings.cr        | ..........&p=1C
 2019-03-24 15:41:37.767 | Charter Stable | 10.11.0.12     |     73 | ESS        | 304 Response       | 304.dd:1553442097.ex:1553528497.lm:1552325617.du:86400.au:1.p:id=25
  

»extract_top_line_new.py

import re
import json

def get_dict(line):
    arr = line.split(" - ")

    d = {
        "mac_address": arr[0],
        "unit_address": arr[1],
        "execution_time": arr[2]
    }
    return d


def get_data_list(file_name):
    data_list = []

    with open(file_name) as f:
        text = f.read()

    # Iterator over every line of hashes in the file
    fiter = re.finditer(r"######################################################################", text)

    while True:
        try:
            # The built-in next() works on both Python 2 and Python 3
            e = next(fiter).end()    # first line of hashes ends here
            s = next(fiter).start()  # second line of hashes starts here
        except StopIteration:
            # No more pairs of hash lines left
            break

        line = text[e + 1: s]  # This is the header line that we want
        data_list.append(get_dict(line.strip()))

    return data_list


if __name__ == "__main__":
    data_list = get_data_list("data_new.txt")

    # pretty printing list of dictionaries
    print(json.dumps(data_list, indent=4))
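
As an alternative to pairing up the lines of hashes, the header lines can be matched directly by their shape. The sketch below is not from the original answer. It assumes every header looks like the four in the sample: 12 hex digits, a unit address in the form 3-5-5-3 digits, then a timestamp, separated by " - ". The names `HEADER_RE` and `find_headers` are hypothetical.

```python
import re

# Assumed header shape, taken from the sample data only:
#   <12 hex digits> - <ddd-ddddd-ddddd-ddd> - <timestamp text>
HEADER_RE = re.compile(
    r"^([0-9a-fA-F]{12}) - (\d{3}-\d{5}-\d{5}-\d{3}) - (.+?)[ \t]*$",
    re.MULTILINE,  # let ^ and $ match at the start and end of every line
)


def find_headers(text):
    """Return one dict per header line found anywhere in the text."""
    return [
        {"mac_address": mac, "unit_address": unit, "execution_time": when}
        for mac, unit, when in HEADER_RE.findall(text)
    ]
```

This does not depend on the exact number of `#` characters, but it will skip any header that deviates from the assumed pattern, so check it against the real file before relying on it.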
