Python: extracting blanks from a list

Time: 2017-10-25 14:37:56

Tags: python python-3.x matching

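I am new to Python, so I may be going about this completely wrong.

I want to extract items from a file and then save them as CSV. This works fine, except when a value is blank: my list skips it, so when the lists are written out, every later item in that list moves up by one and the items in a row are no longer related.

Items in the log file can appear in a different order, which prevents me from saving the items based on their position.

Thanks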

import csv
from itertools import zip_longest

mylist = []
modelCode = []
vin =[]
color=[]


with open('testfile_test.txt') as input_file:
    for line in input_file:
        if "Car Details" in line:
            split_line = line.split(',')
            for text in split_line:
                if "modelCode"in text:
                    split_line, split_line2 = text.split(' ',1)
                    modelCode.append(split_line2)
                else:
                    modelCode.append("")
                   #for items in modelCode:
                    #print(modelCode)
                    #var modelCodeTitle.append(split_line)
                if "vin"in text:
                    split_line, split_line2 = text.split(' ',1)
                    vin.append(split_line2)
                else:
                    vin.append("")
                if "color"in text:
                    split_line, split_line2 = text.split(' ',1)
                    color.append(split_line2)
                else:
                    color.append("")
with open("newfilename.csv", 'a') as outcsv:   
    #configure writer to write standard csv file
    writer = csv.writer(outcsv, delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL, lineterminator='\n')
    writer.writerow(['Vin', 'Model Code', 'Colour', 'Chassis', 'Starting Location', 'Owning Organization', 'Final Organization'])
    #for item in modelCode:
    #for i, val in enumerate(modelCode):
    for a, b, c in zip_longest(vin, modelCode, color):
        #Write item to outcsv
        writer.writerow([a,b,c])

Example input:

Hello World00:00:00.179 INFO [CommandExecutionEngine] Starting transaction   22.08.2017 00:15:27.549 INFO [COMMAND] Car Details,additionalID,vin EFG123456789,modelCode NEW XTRAIL MY17

22.08.2017 00:15:29.001 INFO [COMMAND] Car Details,additionalID,chassis 54715,vin ABC123324679,modelCode JUKE FACELIFT,

22.08.2017 00:15:35.413 INFO [COMMAND] Car Details,additionalID,vin ABC258741258,modelCode JUKE FACELIFT

22.08.2017 08:10:28.169 INFO [COMMAND] Car Details,additionalID,chassis 25417,vin KFE456985234,modelCode NEW GALAXY,color BLUE

22.08.2017 08:10:28.503 INFO [COMMAND] Car Details,additionalID,vin BFE874512458,modelCode MONDEO 5D,color SILVER

22.08.2017 08:10:28.810 INFO [COMMAND] Car Details,vin ABC123456789,modelCode CONNECT V,color SILVER

Desired Output

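3 answers:

Answer 0 (score: 1)

The reason you get all these blanks is that you append a blank every time you check `text` to see which detail it is, even though you have not yet moved on to a new line.

For "22.08.2017 00:15:29.001 INFO [COMMAND] Car Details,additionalID,chassis 54715,vin ABC123324679,modelCode JUKE FACELIFT," the text "vin ABC123324679" causes ABC123324679 to be appended to vin, but it also causes a blank to be appended to modelCode and color. Before adding a blank, you need to wait until you know the item is missing from the whole line, not just from the current text.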
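To see the effect on a single line, the sketch below replays the question's per-field if/else logic on that one log line. The loop over (name, list) pairs is only a compact stand-in for the three repeated if/else blocks; it is not code from the question.

```python
# Sketch: replay the per-field if/else logic from the question on one line.
line = ("22.08.2017 00:15:29.001 INFO [COMMAND] Car Details,additionalID,"
        "chassis 54715,vin ABC123324679,modelCode JUKE FACELIFT,")

vin, modelCode, color = [], [], []

for text in line.split(','):
    # Each list receives one entry per *field* of the line,
    # instead of one entry per *line*.
    for name, target in (("vin", vin), ("modelCode", modelCode), ("color", color)):
        if name in text:
            target.append(text.split(' ', 1)[1])
        else:
            target.append("")

print(len(vin), len(modelCode), len(color))  # 6 6 6
print(vin)  # ['', '', '', 'ABC123324679', '', '']
```

One car has produced six entries in every list, five of them blank, which is why the columns drift apart.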

The smallest change to your code is to use the lengths of the lists to detect whether the line contained each required detail.

with open('testfile_test.txt') as input_file:
    # Count the matching lines by hand: enumerate would also count
    # the lines we skip.
    car = 0
    for line in input_file:
        if "Car Details" in line:
            car += 1  # number of cars seen so far, including this one
            split_line = line.split(',')
            for text in split_line:
                if "modelCode"in text:
                    split_line, split_line2 = text.split(' ',1)
                    modelCode.append(split_line2)
                if "vin"in text:
                    split_line, split_line2 = text.split(' ',1)
                    vin.append(split_line2)
                if "color"in text:
                    split_line, split_line2 = text.split(' ',1)
                    color.append(split_line2)
            # Only after the whole line has been scanned, pad any list
            # that did not receive a value for this car.
            if len(modelCode) < car:
                modelCode.append("")
            if len(vin) < car:
                vin.append("")
            if len(color) < car:
                color.append("")

This is not the approach I would recommend; it is only meant to show why you were getting blanks here.

Here is what I would suggest:

import csv
cars = []


with open('testfile_test.txt') as input_file:
    for line in input_file:
        if "Car Details" in line:
            car = {}
            split_line = [s.strip() for s in line.split(',')]
            for text in split_line:
                detail = text.split(' ', 1)
                if len(detail) == 2:
                    car[detail[0]] = detail[1]
            cars.append(car)

with open("newfilename.csv", 'a') as outcsv:   
    # configure writer to write standard csv file
    writer = csv.writer(outcsv, delimiter=',', quotechar='|',
                        quoting=csv.QUOTE_MINIMAL, lineterminator='\n')
    writer.writerow([
        'Vin', 'Model Code', 'Colour', 'Chassis', 'Starting Location',
        'Owning Organization', 'Final Organization'])
    for car in cars:
        writer.writerow([
            car.get('vin', ''), car.get('modelCode', ''),
            car.get('color', '')])

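This way:

  • Each car is represented by a dictionary containing all of its details from the file, which gives convenient access to the details of an individual car.
  • No detail needs a special case, because they are all formatted the same way.
  • Using dict.get() handles missing details at output time rather than at parse time.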
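As a possible variation on the last point, csv.DictWriter can fill in missing details itself. The sketch below is an illustration, not part of the original answer: the helper name parse_car_line is hypothetical, and an in-memory io.StringIO buffer stands in for the output file.

```python
import csv
import io


def parse_car_line(line):
    """Turn one 'Car Details' log line into a dict of its 'key value' fields."""
    car = {}
    for text in line.split(','):
        detail = text.strip().split(' ', 1)
        if len(detail) == 2:
            car[detail[0]] = detail[1]
    return car


lines = [
    "22.08.2017 00:15:27.549 INFO [COMMAND] Car Details,additionalID,"
    "vin EFG123456789,modelCode NEW XTRAIL MY17",
    "22.08.2017 08:10:28.169 INFO [COMMAND] Car Details,additionalID,"
    "chassis 25417,vin KFE456985234,modelCode NEW GALAXY,color BLUE",
]
cars = [parse_car_line(line) for line in lines]

out = io.StringIO()  # stands in for the real output file
writer = csv.DictWriter(
    out,
    fieldnames=['vin', 'modelCode', 'color'],
    restval='',              # value written when a car lacks a field
    extrasaction='ignore',   # silently skip keys that are not columns
    lineterminator='\n',
)
writer.writeheader()
writer.writerows(cars)
print(out.getvalue())
```

Here restval plays the role of the default in car.get(key, ''), and extrasaction='ignore' drops keys such as chassis or the timestamp fragment that are not among the chosen columns.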

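Answer 1 (score: 0)

I would suggest using dictionaries, or even creating a class, instead of three lists. That gives you better control over the data and makes your code easier to understand.

Try this: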

import csv

# this will be a list of dictionaries, one per car
car_details = []


with open('testfile_test.txt') as input_file:

    for line in input_file:

        if "Car Details" in line:

            # this is your dict object: one per line, with every field
            # starting out blank so that a missing field stays blank
            current_car = {'modelCode': '', 'vin': '', 'color': ''}

            split_line = line.split(',')

            for text in split_line:

                # drop surrounding whitespace and the trailing newline
                text = text.strip()

                if "modelCode" in text:
                    # You can use index instead of creating a object that you will not use
                    current_car['modelCode'] = text.split(' ',1)[1]

                if "vin" in text:
                    current_car['vin'] = text.split(' ',1)[1]

                if "color" in text:
                    current_car['color'] = text.split(' ',1)[1]

            # append once per line, after all of its fields were checked
            car_details.append(current_car)


with open("newfilename.csv", 'a') as outcsv:   

    # configure writer to write standard csv file
    writer = csv.writer(outcsv, delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL, lineterminator='\n')
    writer.writerow(['Vin', 'Model Code', 'Colour', 'Chassis', 'Starting Location', 'Owning Organization', 'Final Organization'])

    for car_detail in car_details:
        # Write item to outcsv; the dict key is 'color', not 'Colour'
        writer.writerow([car_detail['vin'], car_detail['modelCode'], car_detail['color']])
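
The class alternative mentioned in this answer could look roughly like the following sketch. It assumes Python 3.7+ for dataclasses, and the names Car and from_log_line are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class Car:
    # Every field defaults to blank, so a missing detail stays blank.
    vin: str = ''
    modelCode: str = ''
    color: str = ''

    @classmethod
    def from_log_line(cls, line):
        """Build a Car from one 'Car Details' line (hypothetical helper)."""
        known = {f.name for f in fields(cls)}
        car = cls()
        for text in line.split(','):
            detail = text.strip().split(' ', 1)
            # Keep only 'key value' pairs whose key is a declared field.
            if len(detail) == 2 and detail[0] in known:
                setattr(car, detail[0], detail[1])
        return car


car = Car.from_log_line(
    "22.08.2017 00:15:35.413 INFO [COMMAND] Car Details,additionalID,"
    "vin ABC258741258,modelCode JUKE FACELIFT")
print(car)  # Car(vin='ABC258741258', modelCode='JUKE FACELIFT', color='')
```

Compared with a plain dict, a misspelled attribute such as car.colour fails immediately with an AttributeError instead of producing a wrong column later.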

Answer 2 (score: 0)

I updated the code to use dictionaries. In the following code we do not check whether a particular field is present.

import csv
from itertools import zip_longest

total_list = []
with open('testfile_test.txt') as input_file:
    for line in input_file:
        if "Car Details" in line:
            car_details = dict.fromkeys(['vin','modelCode','color','chassis','Starting Location','Owning Organization','Final Organization'],'')
            split_line = line.split(',')
            for text in split_line[1:]:
                value= text.strip().split(' ',1)
                if len(value)>1:
                    car_details[value[0]]=value[1]
            total_list.append(car_details)

with open("newfilename.csv", 'a') as outcsv:   
    writer = csv.writer(outcsv, delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL, lineterminator='\n')
    writer.writerow(['Vin', 'Model Code', 'Colour', 'Chassis', 'Starting Location', 'Owning Organization', 'Final Organization'])
    for detail in total_list:
        writer.writerow([detail['vin'],detail['modelCode'],detail['color'],detail['chassis'],detail['Starting Location'],detail['Owning Organization'],detail['Final Organization']])

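Also, because the sample input is incomplete, I do not know how the fields Starting Location, Owning Organization and Final Organization appear in the input file. So please edit these three fields on line 8 of the code (the dict.fromkeys call).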
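
As a small illustration of why no per-field check is needed, the sketch below runs the same dict.fromkeys idea on one sample line. The shortened field list is an assumption made only to keep the example compact.

```python
# Shortened field list, for illustration only.
field_names = ['vin', 'modelCode', 'color', 'chassis']

# Every expected column starts out blank.
car_details = dict.fromkeys(field_names, '')

line = ("22.08.2017 00:15:35.413 INFO [COMMAND] Car Details,additionalID,"
        "vin ABC258741258,modelCode JUKE FACELIFT")

# Skip the first piece (timestamp and "Car Details" marker).
for text in line.split(',')[1:]:
    value = text.strip().split(' ', 1)
    if len(value) > 1:
        # Fields present in the line overwrite the blank default.
        car_details[value[0]] = value[1]

print(car_details)
```

Fields that never appear in the line, here color and chassis, simply keep their blank default, so every row ends up with the same set of keys.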