How to use SQL with my Python .txt parser

Time: 2017-07-11 13:13:10

Tags: python sql python-3.x

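Currently I have hard-coded the fields that my code needs in order to pull information out of a .txt file for my job. Instead of hard-coding them, I would like to put the fields in a database table and then write a script that reads them from the table.

Current code: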

import pprint
import re


def field_Extract(fileLines, fieldsArray, delimit):
    for line in fileLines:
        for field in fieldsArray:
            if line.startswith(field):
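                # Split on the first delimiter only, so values that contain
                # the delimiter themselves do not raise a ValueError
                key, value = line.split(delimit, 1)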
                print(key.rstrip(), " : ", value.strip())


test_file = open('/parse.txt', 'r+')

# ********************************************************************
#
# THE PART BELOW IS WHERE I WANT THE PROGRAM TO GET IT FROM A TABLE
# ALREADY CREATED BY ME AND FILLED IN WITH THE REQUIRED FIELDS. I'D LIKE
# TO USE ONE TABLE FOR THIS ALTHOUGH THE FIELDS WILL VARY IN COLUMN FOR
# EACH DATA
#
# ********************************************************************

job1 = ['NUMBER OF INPUT RECORDS', 'RECORDS WRITTEN TO ADJUSTMENT FILE', 'RECORDS WRITTEN TO ERROR FILE',
        'COD RECORDS PASSED CEV VALIDATION',
        'COD RECORDS FAILED CEV VALIDATION', 'ADS RECORDS PASSED CEV VALIDATION', 'ADS RECORDS FAILED CEV VALIDATION',
        'PAR ACCESSORIAL REFUND RECORDS']
job2 = ['6010 TOTAL DELIVERY RECORDS READ', '6050 TOTAL ERROR RECORDS WRITTEN',
        '7025 TOTAL PKG DERIVED DATA ROWS INSERTED', '7035 TOTAL PKG DERIVED DATA ROWS UPDATED',
        '7027 TOTAL ACC DERIVED DATA ROWS INSERTED', '7030 TOTAL DELIVERY ROWS UPDATED', 'TSOURCE DERIVED ZONE',
        'SLI INVALID FOR SAT', 'COUNTS FOR COUNTRY',
        'RECORDS READ', 'ERRORS WRITTEN', 'TKPUGPD INSERTED', 'TPKGUPD UPDATED', 'TACCUPD INSERTED',
        'CEV ASY DROP & WRITTEN', 'ZONE NOT FOUND ERRORS']
job3 = ['6010 TOTAL DELIVERY RECORDS READ', '6050 TOTAL ERROR RECORDS WRITTEN', '6030 TOTAL FCB RECORDS WRITTEN',
        '2317 TOTAL NON GENUINE DUP DELIV SCANS', '2270 TOTAL INVALID RESID/COMMER INDICATOR',
        '8130 TOTAL NON LTR CNTNR WITH 0 WEIGHT',
        '1010 TOTAL SHIPPERS NOT IN CRIS', '1080 TOTAL SHIPPERS ADDRESS NOT IN CRIS',
        '1070 TOTAL SHIPPERS CENTER NOT IN CRIS',
        '2310 TOTAL COD INVALID FOR ORIG CNY/ZONE ', '2308 TOTAL EAM SERVICE DOWNGRADED',
        '2309 TOTAL EAM SAT ACCESSORIALS DROPPED']
job4 = ['6010 SHIPMENT RECORDS READ','6030 FCB RECORDS WRITTEN']
job5 = ['TOTAL PACKAGE RECORDS READ', 'TOTAL TPKGUPD RECORDS UPDATED', 'TOTAL TACCUPD RECORDS INSERTED']

# Jobs start and end strings
jobStartStr = ['BEGINNING N260RV30', 'PROGRAM N260CV10 BEGINNING', 'N260CV08 BEGINNING',
               'PROGRAM N260GV12 BEGINNING', 'PROGRAM N260CW13 BEGINNING']
jobEndStr = ['END OF    N260RV30', 'SUCCESSFUL END N260CV10', 'SUCCESSFUL END N260CV08',
             'SUCCESSFUL END N260GV12', 'SUCCESSFUL END N260CW13']

currentJob = -1
currentJobData = []
startAppending = False
for line in test_file:
    # If job start found, gather job lines
    if startAppending:
        currentJobData.append(line)

    # Get the current job
    for jobStart in jobStartStr:
        if jobStart in line:
            currentJob = jobStartStr.index(jobStart) + 1
            # Set the flag to start gathering job lines
            startAppending = True

    # Set the correct job
    if currentJob == 1:
        job = job1
    elif currentJob == 2:
        job = job2
    elif currentJob == 3:
        job = job3
    elif currentJob == 4:
        job = job4
    elif currentJob == 5:
        job = job5
    else:
        currentJob = -1

    # Check job end using End string list
    for jobEnd in jobEndStr:
        if (currentJob != -1) and (jobEnd in line):
            # As a job end found, stop gathering lines
            startAppending = False
            print('########============ NEW JOB STARTS HERE ===========#########')
            # Execute valid jobs
            if currentJob != -1:
                field_Extract(currentJobData, job, ':')
            # Erase completed job lines
            currentJobData = []
            # Set job to invalid job
            currentJob = -1

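My end goal, once I have figured out this first problem, is to store all the values in another Access table.

1 answer:

Answer 0 (score: 0)

To get the contents of the job1 ... job5 lists out of a database and into Python lists, create a table with the following format: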

create table jobdata ( 
    job integer not null,
    joblabel text not null,
    constraint pk_jobdata primary key (job, joblabel)
    );

Once this table is populated, it should look like this:

+------+------------------------------------+
|  job |              joblabel              |
+------+------------------------------------+
|    1 | NUMBER OF INPUT RECORDS            |
|    1 | RECORDS WRITTEN TO ADJUSTMENT FILE |
|    1 | RECORDS WRITTEN TO ERROR FILE      |
|    1 | COD RECORDS PASSED CEV VALIDATION  |
|    ... etc ...                            |
+------+------------------------------------+
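As a rough sketch of how such a table might be created and filled from Python, the example below assumes SQLite through the standard-library sqlite3 module and an in-memory database. The question does not say which database is in use, so the engine, the `JOB_LABELS` dict, and the `create_and_fill_jobdata` helper are all illustrative assumptions; only two of the five jobs are shown.

```python
import sqlite3

# Assumption: SQLite is used only for illustration; the question does not
# name a database engine. Only a subset of the labels is listed here.
JOB_LABELS = {
    1: ['NUMBER OF INPUT RECORDS', 'RECORDS WRITTEN TO ADJUSTMENT FILE',
        'RECORDS WRITTEN TO ERROR FILE', 'COD RECORDS PASSED CEV VALIDATION'],
    4: ['6010 SHIPMENT RECORDS READ', '6030 FCB RECORDS WRITTEN'],
}


def create_and_fill_jobdata(db, labels_by_job):
    """Create the jobdata table and insert one row per (job, label) pair."""
    db.execute(
        "create table if not exists jobdata ("
        " job integer not null,"
        " joblabel text not null,"
        " constraint pk_jobdata primary key (job, joblabel))"
    )
    rows = [(job, label)
            for job, labels in labels_by_job.items()
            for label in labels]
    # "?" is the sqlite3 placeholder style; other drivers may differ.
    db.executemany("insert into jobdata (job, joblabel) values (?, ?)", rows)
    db.commit()


db = sqlite3.connect(':memory:')
create_and_fill_jobdata(db, JOB_LABELS)
```

Because (job, joblabel) is the primary key, inserting the same label twice for one job raises an integrity error, which guards against accidental duplicates.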

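To fetch the data into a Python list, create a database connection and use a function like this:

def get_job_labels(db, jobno):
    """Return the list of labels stored for the given job number."""
    curs = db.cursor()
    # %d only accepts a number, so jobno must be an integer here
    curs.execute("select joblabel from jobdata where job = %d;" % jobno)
    return [row[0] for row in curs.fetchall()]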

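where the function arguments are the database connection object and the job number, as it appears in the job column of the table. The returned list can be assigned directly to the job1 ... job5 variables.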
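The following is a minimal sketch of how this could look end to end, again assuming SQLite via sqlite3 with a small in-memory demo table. It is a variant of the function described above that passes the job number as a bound parameter instead of formatting it into the SQL string; the "?" placeholder is specific to sqlite3 and some other drivers. The `load_all_jobs` helper is a hypothetical addition, not part of the original answer, showing how one query could replace the job1 ... job5 variables with a single dict.

```python
import sqlite3


def get_job_labels(db, jobno):
    """Return the labels of one job, passing jobno as a bound parameter."""
    curs = db.cursor()
    # sqlite3 uses "?" placeholders; other DB-API drivers may use "%s" or ":name".
    curs.execute("select joblabel from jobdata where job = ?", (jobno,))
    return [row[0] for row in curs.fetchall()]


def load_all_jobs(db):
    """Return a dict mapping each job number to its list of labels."""
    jobs = {}
    for job, label in db.execute("select job, joblabel from jobdata"):
        jobs.setdefault(job, []).append(label)
    return jobs


# Demo data in an in-memory database (illustration only).
db = sqlite3.connect(':memory:')
db.execute("create table jobdata (job integer not null, joblabel text not null,"
           " constraint pk_jobdata primary key (job, joblabel))")
db.executemany("insert into jobdata values (?, ?)", [
    (4, '6010 SHIPMENT RECORDS READ'),
    (4, '6030 FCB RECORDS WRITTEN'),
    (5, 'TOTAL PACKAGE RECORDS READ'),
])

job4 = get_job_labels(db, 4)
jobs = load_all_jobs(db)

# In the parsing loop, the if/elif chain could then become a lookup;
# None means that no known job is active.
currentJob = 5
job = jobs.get(currentJob)
print(job)
```

With the dict in place, adding a sixth job would only require new rows in the table rather than another elif branch in the script.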

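If you want or need to read jobStartStr and jobEndStr from the database as well, you can use the same approach. However, you should use a different table for them, because those lists contain logically different data and their identifiers have a different data type.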
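One possible shape for that second table is sketched below. The table name `jobmarkers`, its columns, and the `get_job_markers` helper are hypothetical and not taken from the answer; SQLite is again assumed purely for illustration, and only two jobs are loaded. Rows are read back ordered by job number so that the existing `jobStartStr.index(jobStart) + 1` lookup in the question's loop would still yield the right job.

```python
import sqlite3

db = sqlite3.connect(':memory:')
# Hypothetical table: one row per job with its start and end marker strings.
db.execute(
    "create table jobmarkers ("
    " job integer primary key,"
    " startstr text not null,"
    " endstr text not null)"
)
db.executemany("insert into jobmarkers values (?, ?, ?)", [
    (1, 'BEGINNING N260RV30', 'END OF    N260RV30'),
    (2, 'PROGRAM N260CV10 BEGINNING', 'SUCCESSFUL END N260CV10'),
])


def get_job_markers(db):
    """Return (start strings, end strings), both ordered by job number."""
    rows = db.execute(
        "select startstr, endstr from jobmarkers order by job"
    ).fetchall()
    return [row[0] for row in rows], [row[1] for row in rows]


jobStartStr, jobEndStr = get_job_markers(db)
```

Keeping the start and end string of a job in the same row also makes it harder for the two lists to drift out of step, which is easy to do when they are maintained as separate literals.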