Question

Sample line from file1.txt:

2016-04-30 02:03:55,417 INFO  [http-nio-443-exec-51] xxxxxxxxxxxxxx (xxxxxxxxxxxxxx.java:1364)     - TRX[160430120042]::paymentResult::xxxxxxxxxxxxx(Billing)Response::TRANSACTION[160450001042], REFERENCE_CODE[1461953034575], END_USER_ID[tel:+639422387059], OPERATION_RESULT[CHARGED] ERR[ERR_NONE], ERR_MSG[null], RAW_RESPONSE[{ "amountTransaction":{ "serverReferenceCode":"060601159083533", "serviceID":"OX036", "transactionOperationStatus":"Charged", "endUserId":"tel:+639422387059", "referenceCode":"30001-1461953034775", "paymentAmount":{ "totalAmountCharged":"130.00", "chargingInformation":{ "description":"Description of ChargeAmount Request", "currency":"USD", "amount":"130.00" } } } }]

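I started learning Python just a few days ago and have been looking around Stack Overflow. I can't figure out how to loop over every line of the file: my code only searches the first line.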

import re
with open('file1.txt') as f:
    my_text = f.read()
    re1='((?:2|1)\\d{3}(?:-|\\/)(?:(?:0[1-9])|(?:1[0-2]))(?:-|\\/)(?:(?:0[1-9])|(?:[1-2][0-9])|(?:3[0-1]))(?:T|\\s)(?:(?:[0-1][0-9])|(?:2[0-3])):(?:[0-5][0-9]):(?:[0-5][0-9]))'    # Time Stamp 1
    re2='.*?'   # Non-greedy match on filler
    re3='\\[.*?\\]' # Uninteresting: sbraces
    re4='.*?'   # Non-greedy match on filler
    re5='\\[.*?\\]' # Uninteresting: sbraces
    re6='.*?'   # Non-greedy match on filler
    re7='(\\[.*?\\])'   # Square Braces 1
    re8='.*?'   # Non-greedy match on filler
    re9='".*?"' # Uninteresting: string
    re10='.*?'  # Non-greedy match on filler
    re11='".*?"'    # Uninteresting: string
    re12='.*?'  # Non-greedy match on filler
    re13='(".*?")'  # Double Quote String 1
    re14='.*?'  # Non-greedy match on filler
    re15='".*?"'    # Uninteresting: string
    re16='.*?'  # Non-greedy match on filler
    re17='(".*?")'  # Double Quote String 2
    re18='.*?'  # Non-greedy match on filler
    re19='".*?"'    # Uninteresting: string
    re20='.*?'  # Non-greedy match on filler
    re21='(".*?")'  # Double Quote String 3
    re22='.*?'  # Non-greedy match on filler
    re23='".*?"'    # Uninteresting: string
    re24='.*?'  # Non-greedy match on filler
    re25='(".*?")'  # Double Quote String 4
    re26='.*?'  # Non-greedy match on filler
    re27='".*?"'    # Uninteresting: string
    re28='.*?'  # Non-greedy match on filler
    re29='(".*?")'  # Double Quote String 5
    re30='.*?'  # Non-greedy match on filler
    re31='".*?"'    # Uninteresting: string
    re32='.*?'  # Non-greedy match on filler
    re33='".*?"'    # Uninteresting: string
    re34='.*?'  # Non-greedy match on filler
    re35='(".*?")'  # Double Quote String 6
    rg = re.compile(re1+re2+re3+re4+re5+re6+re7+re8+re9+re10+re11+re12+re13+re14+re15+re16+re17+re18+re19+re20+re21+re22+re23+re24+re25+re26+re27+re28+re29+re30+re31+re32+re33+re34+re35,re.IGNORECASE|re.DOTALL)
    m = rg.search(my_text)
    if m:
        timestamp1 = m.group(1)
        sbraces1 = m.group(2)
        string1 = m.group(3)
        string2 = m.group(4)
        string3 = m.group(5)
        string4 = m.group(6)
        string5 = m.group(7)
        string6 = m.group(8)
        data = timestamp1 + ',' + sbraces1.strip('[]') + ',' + string1.strip('"') + ',' + string2.strip(
            '"') + ',' + string3.strip('"') + ',' + string4.replace('tel:+', '').strip('"') + ',' + string5.strip(
            '"') + ',' + string6.strip('"')
        print(data)
f.close()

Output

2016-04-30 02:03:55,160430000042,060601159083533,OB056,Charged,639422387059,30001-1461953034575,130.00

Answer 1

To loop over each line of the file:

import re
with open('file1.txt') as f:
    for line in f:
        # do something with line, e.g. call rg.search(line) here
        pass

Note that there is no need to call f.close(); the with ... context manager closes the file when the block exits.
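
To make the per-line approach concrete, here is a minimal sketch that compiles a pattern once and applies it to each line in turn. The pattern is a deliberately simplified stand-in for the one in the question (it captures only the timestamp, the TRX id and the END_USER_ID number), and the names LINE_RE, extract_rows and print_rows are hypothetical, chosen for illustration only.

```python
import re

# Simplified, hypothetical pattern (not the full one from the question).
LINE_RE = re.compile(
    r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+'  # timestamp; milliseconds are matched but not captured
    r'.*?TRX\[(\d+)\]'                             # first TRX[...] id on the line
    r'.*?END_USER_ID\[tel:\+(\d+)\]'               # phone number without the "tel:+" prefix
)


def extract_rows(lines):
    """Yield one comma-separated row for every line that matches LINE_RE."""
    for line in lines:
        m = LINE_RE.search(line)  # search this line only, not the whole file
        if m:
            yield ','.join(m.groups())


def print_rows(path):
    """Open the file at `path` and print one row per matching line."""
    with open(path) as f:  # the file is closed automatically on exit
        for row in extract_rows(f):
            print(row)
```

Calling print_rows('file1.txt') would then print one row per matching log line. Lines that do not match are skipped silently, which may or may not be what you want for malformed entries.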
